[57] ABSTRACT

Parallel data-structures distribute a given data set to system 
components by grouping the data set according to ranges. 
These ranges are sub-divided for distribution into parallel 
form. A given data value is located by its placement within 
an appropriate range; the ranges are located by their rela- 
tionships to each other and the data set as a whole; thus, the 
ranges are related to each other, the order of the data set is 
maintained and access is gained to the data set by range. 
Each range may be distributed to multiple nodes; each node 
may be contained in a separate data-structure; each separate 
data-structure may be maintained on a separate system 
component. The result is a method of creating and using 
parallel data-structures that may take a wide variety of forms 
and be used to control data distribution and the efficient 
distribution of system resources. 
METHOD FOR CREATING AND USING
PARALLEL DATA STRUCTURES 

CROSS-REFERENCE TO RELATED 
APPLICATIONS 

This application claims priority under 35 U.S.C. § 119(e)
to the following related provisional applications: Ser. No. 
60/023,340, filed Jul. 25, 1996 and Ser. No. 60/022,616, 
filed Jul. 26, 1996. 

BACKGROUND OF PROBLEM AND 
SOLUTION 

In recent years, the need for more computational power 
and speed in computer systems has led to the use of
multiple processors to perform computational tasks. The 
processors work cooperatively, sharing resources and dis- 
tributing work amongst themselves by sending data over 
communication lines or through shared memory. This prac- 
tice of utilizing multi-processors to accomplish a single task 
is known as parallel processing or distributed processing. 
Although the terms parallel and distributed may describe 
distinct forms of multi-processing, they are in essence 
synonymous. The problems described and solved by the 
present invention apply equally to parallel and distributed 
processing. In addition, these problems/solutions apply to 
any component or aspect of a computer system to which 
work may be distributed amongst multiple components, 
even in non-parallel systems: one such "non-parallel" appli- 
cation described herein is the use of the present invention to 
manage dynamic access storage devices (DASD) such as 
disk drives [see section Rules for Fullness and Ordering 
Scheme for B-trees Stored on Disk]. 

Dividing work amongst multi-processors in such a way 
that the work is divided evenly and performed in an efficient 
manner is the goal of parallel/distributed processing, and 
dividing work amongst multiple system components equally 
and efficiently is also desirable in sequential (single- 
processor) systems. Many well-known sequential methods, 
systems or processes exist to efficiently perform computa- 
tional tasks (sorting, merging, etc.). Their parallel counter- 
parts have yet to be invented. Some new parallel methods 
are parallelized versions of existing sequential methods, for 
example, the parallel recursive merge-sort [see "Introduction
to Parallel Methods" by Joseph JaJa, Addison-Wesley,
1992]. The invention described herein is a method of cre- 
ating parallel data-structures. The preferred embodiment of 
the present invention parallelizes single-processor, ordered 
list methods to efficiently distribute the work and storage for 
ordered list maintenance amongst multiple processors, pro- 
cessing components and/or storage locations: this ordered 
list maintenance is carried out through adapted versions of
single-processor data-structures expressed as graphs 
(B-trees, AVL trees, linked-lists, m-way trees, heaps, etc.). 

DISCUSSION OF PROBLEM AND PREFERRED 
EMBODIMENT 

The goal in parallel processing is to utilize a number of 
processors (P) to increase the system's speed and power by
a factor of P: optimally, a task requiring time T on a 
single-processor can be accomplished in time T/P on P 
processors. The problem is the even distribution of work 
amongst the P processors. Many new methods have arisen 
from the field to efficiently distribute the work for standard 
computational tasks (e.g. sorting, merging, etc.). One stan- 
dard task is the maintenance of ordered lists of data: many 
methods exist for single-processor systems to accomplish
ordered list maintenance; the problem described and solved
herein is the efficient distribution of work amongst multi- 
processors to accomplish efficient ordered list maintenance. 
(The term "ordered list" includes many data-structures:
sorted lists, heaps, stacks, trees, etc.).

In general (regardless of the type of system keeping the
lists), the maintenance of ordered lists consists of two basic
operations: Insert() and Remove() (Search()/Find() is
implied). Insertion into the list requires that an element of
data be added to the list and that its position within the list
be defined. Assuming the ordered list {5,12,46,67,80,99},
the Insertion (Insert(x)) of the numeric element 35
(Insert(35)) results in the list {5,12,35,46,67,80,99}. Removal
(Remove(x)) of an element can take several forms: removal
by location, by value, by range of values, etc. Again,
assuming the list {5,12,46,67,80,99}, Removal of the fourth
(4th) element results in {5,12,46,80,99}; Removal of the
value 12 results in {5,46,67,80,99}; Removal of the smallest
element greater than 50 results in {5,12,46,80,99}. The
Remove operation is considered to return the value of the
removed element for use, if present, or to return the
information that a specific value is not contained in the list,
if not present.
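
The single-processor operations described above can be sketched as follows. This is a minimal illustration in Python, not the method of the invention; the function names (insert, remove_at, remove_value, remove_smallest_greater_than) are hypothetical and were chosen only to mirror the three forms of Removal named in the text.

```python
import bisect

def insert(lst, x):
    """Insert x into the sorted list lst, keeping it ordered."""
    bisect.insort(lst, x)

def remove_at(lst, position):
    """Remove by location (1-based). Returns the element, or None if absent."""
    if 1 <= position <= len(lst):
        return lst.pop(position - 1)
    return None

def remove_value(lst, x):
    """Remove by value. Returns x, or None if x is not in the list."""
    i = bisect.bisect_left(lst, x)
    if i < len(lst) and lst[i] == x:
        return lst.pop(i)
    return None

def remove_smallest_greater_than(lst, bound):
    """Remove the smallest element strictly greater than bound."""
    i = bisect.bisect_right(lst, bound)
    if i < len(lst):
        return lst.pop(i)
    return None
```

Applied to the list {5,12,46,67,80,99}, these functions reproduce the results given in the text; a Remove of a value that is not present returns None rather than raising an error, matching the statement that the operation reports absence.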

The problem presented and solved in the preferred 
embodiment is the parallelizing of the list maintenance 
described above. The essential functioning of the list 
remains the same in the parallel version of the data-structure. 
The Insert(x) and Remove(x) operations produce the same 
results. However, on a single-processor system these operations
are performed by one processor which can only Insert
or Remove one element at a time; on a multi-processor 
system with P processors, the parallel version of the method 
can Insert and/or Remove P elements at a time as described 
below. 

Assuming a multi-processor system with 3 processors
(P=3), and also assuming a list containing the elements
{4,13,14,20,28,34,39,43,53,67,76,81}, we have the following
parallelized result: each processor keeps approximately
one-third of the elements at any given time; each processor
may Insert(x) into its own sub-list at any given time
(possibly sending the element x to one of the other
processors for Insertion into one of the other sub-lists); each
processor may Remove(x) from its sub-list at any time and
may request that other processors attempt to locate element
x in their sub-lists if x is not present in the original
processor's sub-list; any other processor finding x in its
sub-list then sends x to the original processor.

The sub-lists are distributed in this example by cutting the 
list into equal thirds. This manner of distribution is for the
purpose of a generalized example only. The Example given
in this section is intended to introduce the reader to the 
problem in the most generalized manner possible; the 
Example here contains none of the specific details of the 
parallel method. 

The Parallel List

Processor #1 (P1) keeps one-third of the elements:

Sub-list S1={4,13,14,20}
Processor #2 (P2) keeps one-third of the elements:
Sub-list S2={28,34,39,43}

Processor #3 (P3) keeps one-third of the elements:
Sub-list S3={53,67,76,81}
Insertion into the Parallel List 

The elements 72, 22, and 12 are to be Inserted. All three
processors simultaneously perform the Insertion giving the
results:

Processor #1 (P1): Insert(72) — sends element 72 to P3,
receives element 12 from P3
Sub-list S1={4,12,13,14,20}
Processor #2 (P2): Insert(22) — inserts element 22 directly
into its sub-list (S2)

Sub-list S2={22,28,34,39,43}
Processor #3 (P3): Insert(12) — sends element 12 to P1,
receives element 72 from P1

Sub-list S3={53,67,72,76,81}
Removal from the Parallel List (List Contains Elements 
from Insertion Above) 

The values 37, 28 and 13 are to be found and Removed.
All three processors simultaneously perform the Removal,
giving the results:

Processor #1 (P1): Remove(37) — requests another processor
to find 37 and receives reply from P2 that the element 37 is
not present, receives request for element 13 from P3 and
Removes 13 from the list.

Sub-list S1={4,12,14,20}
Processor #2 (P2): Remove(28) — removes 28 directly from
the list, replies "37 not present" to P1

Sub-list S2={22,34,39,43}
Processor #3 (P3): Remove(13) — requests another processor
to find 13, receives 13 from P1

Sub-list S3={53,67,72,76,81}

It must be stressed that the example above is a generalized
example intended to explain the basic logical functionality 
of the problem. The precise details and organization of 
parallelized lists are described in subsequent sections. 
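
The generalized three-processor example can be simulated in a few lines. The sketch below is sequential Python that only models which sub-list ends up holding each element; it is not the parallel method itself. The class name ParallelList, the lower_bounds parameter and the boundary values 21 and 44 are assumptions introduced for illustration, since the text does not state where the sub-list boundaries lie.

```python
import bisect

class ParallelList:
    """Sequential model of P sorted sub-lists, one per processor."""

    def __init__(self, sublists, lower_bounds):
        # lower_bounds[k] is the smallest value that sub-list k+1 may hold
        # (an assumed routing rule, not taken from the source).
        self.sublists = [sorted(s) for s in sublists]
        self.lower_bounds = list(lower_bounds)

    def owner(self, x):
        """Index of the processor whose sub-list should hold x."""
        return bisect.bisect_right(self.lower_bounds, x)

    def insert(self, origin, x):
        """Insert x; returns (owner, forwarded) where forwarded means
        the element had to be sent from origin to another processor."""
        target = self.owner(x)
        bisect.insort(self.sublists[target], x)
        return target, target != origin

    def remove(self, origin, x):
        """Remove x by value; returns x, or None if no sub-list holds it."""
        sub = self.sublists[self.owner(x)]
        i = bisect.bisect_left(sub, x)
        if i < len(sub) and sub[i] == x:
            return sub.pop(i)
        return None
```

With the sub-lists S1, S2 and S3 from the example and the assumed boundaries [21, 44], inserting 72, 22 and 12 from processors 1, 2 and 3 and then removing 37, 28 and 13 yields the same sub-lists as shown above. Processors are indexed from 0 in the code, so P1 is index 0.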

The essential functioning of an ordered list is described
above; however, many different forms of lists are used on
modern systems, and many different types of data may be
stored. Efficient methods/data-structures are used to
maintain such lists on single-processor systems: heaps, binary
trees, AVL trees, B-trees, etc., which are well known in the
art. (For descriptions of such methods/data-structures see
"File Structures Using Pascal" by Nancy Miller, The
Benjamin/Cummings Publishing Co., Inc. (1987)). The
methods used on modern systems were designed to function
on single-processor systems efficiently. This efficiency is
expressed by asymptotic time-complexity functions. The
functions are generally expressed in terms of n in the form
O(f(n)) [e.g. O(log₂ n) or O(n²)]. For the problem to be truly
solved, a parallel version of a list maintenance method must
distribute the work amongst the P processors efficiently so
that the time-complexity approaches optimum improvement
(speedup). Perfect speedup for a given parallelized method
would be O(f(n)/P).

SUMMARY 

The present invention is a means to create parallel
data-structures and associated maintenance programs. The
data-structures and programs may take a variety of forms, all
using the same essential steps and components. The parallel
data-structures distribute a given data set to system
components by grouping the data set according to ranges. These
ranges are sub-divided for distribution into parallel form. A
given data value is located by its placement within an
appropriate range; the ranges are located by their
relationships to each other and the data set as a whole; thus, as the
ranges are related to each other, the order of the data set is
maintained and access may be gained to the data set by
range, and as the data values are related to the ranges, the
data values themselves may be maintained as well.

In order for a data set to change, the values or the 
relationships between the values must change. The present
invention allows this change by altering the ranges or the
relationships between the ranges and thereby altering the
values or relationships between values. Altering a range may
alter the sub-set of data contained by the range, and this
range alteration may then be used to re-distribute data values
and maintain appropriate sizes and locations for the data
sub-sets. The maintenance of the ranges, sub-sets and data
value distribution within the sub-sets offers a wide variety of
possible overall distributions of data sets and methods of
maintaining order. Some of these distributions and methods 
are parallel forms of serial data-structures. 

The present invention offers many advantages including: 
a flexible means to create a wide variety of parallel data- 
structures rather than simply defining a single instance of a 
particular parallel data-structure; flexible methods of dis- 
tributing data within a structure for efficiency; the ability to 
create parallel versions of serial data-structures that maintain 
the essential efficiency and express the essential form of the 
serial data structures without significant alteration of the 
principles or methods that underlie the serial data-structures. 

OBJECTS AND ADVANTAGES 

One object of the method of creating data-structures is to
distribute work and storage to multiple system components.
The method can accomplish the distribution of work by
allowing simultaneous access to multiple parallel nodes,
graphs or indexes by multiple processing elements in a
flexible manner. It can accomplish the distribution of storage
by distributing multiple parallel nodes to multiple storage
locations.

Another object is to provide the ability to distribute data 
more evenly. A data set with a skewed distribution may be 
more evenly distributed by breaking the data into sub-sets. 
Each sub-set may be distributed evenly while all of the 
sub-sets taken together still express the original distribution 
of the data set. 

An advantage of the method when used to transform serial 
data-structures into parallel form is that the original structure 
of the serial algorithm can be expressed without altering the 
essence of the algorithm. 

Another advantage is the wide range of possible structures 
created. Many serial data-structures may be adapted using 
the same principles as well as many new parallel data- 
structures created. 

Another advantage is the use of various components of 
the method to refine the functioning, data distribution, work 
distribution and efficiency of the data-structures and asso- 
ciated maintenance programs through the characteristics of 
the rules that support the various components. For only one 
example, see the Rules for Fullness and Ordering Scheme 
for B-trees Stored on Disk section contained herein. 

Still other objects and advantages will become apparent 
through a consideration of the other descriptions of the 
invention contained herein. 

BRIEF DESCRIPTION OF FIGURES 
FIG. 1 shows serial b-tree. 

FIG. 2 shows parallel b-tree on two processors with 
indication of G-node and P-nodes for preferred embodiment.

FIG. 3 shows parallel b-tree of FIG. 2 after removal of one 
G-node. 

FIG. 4 shows serial AVL tree of Example 1 for preferred 
embodiment. 

FIG. 5 AVL tree of FIG. 4 after addition of element. 

FIG. 6 AVL tree of FIG. 4 after rotation. 

FIG. 7 AVL tree of FIG. 4 after another addition. 
FIG. 8 AVL tree of FIG. 4 after another addition.

FIG. 9 AVL tree of FIG. 4 after rotation. 

FIG. 10 AVL tree of FIG. 4 after removal of element. 

FIG. 11 parallel AVL tree of Example 1 for preferred 
embodiment, comprising 3 separate trees stored on 3 pro- 
cessors. 

FIG. 12 AVL tree of FIG. 11 after addition of element.

FIG. 13 AVL tree of FIG. 11 after another addition. 

FIG. 14 shows redistribution of elements to maintain 
Ordering Scheme. 

FIG. 15 shows range split and redistribution of elements 
resulting in creation of new G-node (G-node Split) for 
Examples 1 and 2 of preferred embodiment. 

FIG. 16 AVL tree of FIG. 11 after insertion of G-node. 

FIG. 17 AVL tree of FIG. 11 after rotation. 

FIG. 18 AVL tree of FIG. 11 after another addition of 
elements. 

FIG. 19 shows redistribution of elements to maintain 
Ordering Scheme. 

FIG. 20 shows range split and redistribution of elements 
resulting in creation of new G-node (G-node Split) for 
Examples 1 and 2 of preferred embodiment. 

FIG. 21 AVL tree of FIG. 11 after insertion of G-node. 

FIG. 22 AVL tree of FIG. 11 after another addition of 
elements. 

FIG. 23 shows redistribution of elements to maintain 
Ordering Scheme. 

FIG. 24 shows range split and redistribution of elements 
resulting in creation of new G-node (G-node Split) for 
Examples 1 and 2 of preferred embodiment. 

FIG. 25 AVL tree of FIG. 11 after insertion of G-node. 

FIG. 26 AVL tree of FIG. 11 after rotation. 

FIG. 27 shows removal of elements from tree of FIG. 11. 

FIG. 28 shows another removal of elements from tree
of FIG. 11. 

FIG. 29 shows result of G-node removal from tree of FIG. 
11. 

FIG. 30 shows serial B-tree of Example 2 for preferred
embodiment, comprising 3 separate trees stored on 3 pro- 
cessors. 

FIG. 31 B-tree of FIG. 30 after addition of element. 
FIG. 32 B-tree of FIG. 30 after b-tree node split. 
FIG. 33 B-tree of FIG. 30 after additional b-tree node 
split. 

FIG. 34 B-tree of FIG. 30 after another addition. 

FIG. 35 B-tree of FIG. 30 after another addition. 

FIG. 36 B-tree of FIG. 30 after b-tree node split. 

FIG. 37 B-tree of FIG. 30 after removal of element and 
b-tree node merge. 

FIG. 38 parallel B-tree of Example 2 for preferred 
embodiment. 

FIG. 39 B-tree of FIG. 38 after addition of element. 

FIG. 40 B-tree of FIG. 38 after another addition. 

FIG. 41 shows redistribution of elements to maintain 
Ordering Scheme. 

FIG. 42 B-tree of FIG. 38 after insertion of G-node. 

FIG. 43 B-tree of FIG. 38 after b-tree node split. 

FIG. 44 B-tree of FIG. 38 after additional b-tree node 
split. 

FIG. 45 B-tree of FIG. 38 after another addition of 
elements. 
FIG. 46 shows redistribution of elements to maintain
Ordering Scheme.
FIG. 47 B-tree of FIG. 38 after insertion of G-node.
FIG. 48 B-tree of FIG. 38 after another addition of
elements.

FIG. 49 shows redistribution of elements to maintain
Ordering Scheme.
FIG. 50 B-tree of FIG. 38 after insertion of G-node.
FIG. 51 B-tree of FIG. 38 after b-tree node split.

FIG. 52 shows removal of elements from tree of FIG. 38. 
FIG. 53 shows another removal of element from tree of 
FIG. 38. 

FIG. 54 shows result of G-node removal from tree of FIG.
38.

FIG. 55 parallel B-tree stored on three disks for Example
of B-trees Stored on Disk section.

FIG. 56 B-tree of FIG. 55 after element addition.
FIG. 57 shows redistribution of elements to maintain
Ordering Scheme. 

FIG. 58 B-tree of FIG. 55 after another addition of 
elements. 

FIG. 59 shows range split and redistribution of elements
resulting in creation of new G-node (G-node Split) for
B-tree of FIG. 55.

FIG. 60 B-tree of FIG. 55 after insertion of G-node.
FIG. 61 B-tree of FIG. 55 after b-tree node split.
FIG. 62 B-tree of FIG. 55 after additional b-tree node
split. 

FIG. 63 data model for a preferred instance of present 
invention. 

FIG. 64 flow chart for a preferred instance of present
invention.

FIG. 65 shows nine P-nodes related by complex G-node
Range.

FIG. 66 diagram of hypercube network with terminals and
disk storage for Example of Application 1.

FIG. 67 diagram of distributed network showing three
client terminals, one server and three disk-packs for
Example of Application 2.
FIG. 68 is a block diagram illustrating the principles of
the invention.

PREFERRED EMBODIMENT 

Introduction 

The preferred embodiment of the present invention relates to
a process of creating parallel data-structures which adapts
sequential data-structures and their associated processing
programs for use in parallel or distributed environments. The
invention achieves this by creating parallel data-structures
that are identical in form and function to the sequential
data-structures in a parallel environment. The adapted
parallel data-structures and methods can be used in the same
way as their sequential counterparts but in a parallel
environment. The sequential data-structures which are adapted
must have configurations determined by the orderable
qualities of the data contained in the data-structures. The
sequential data-structures and their associated maintenance
programs generally have three functions in common: Find(),
Insert() and Remove() functions.

FIG. 68 is a diagram depicting applicant's parallel indexing
method. FIG. 68 depicts composite Global Index comprising three
local indexes; one of the five composite global nodes is
circled and labeled (a-d); figure shows common structure
and access methods for all indexes, common ranges on all
indexes, local links based on range difference, global links
based on range commonality, and storage of one value "v"
on index 2 in Range (t-z). A query on the value "v" can
originate on any index 1, 2 or 3 equivalently. Assuming the
query starts on index 1, the request may be shunted
immediately to either index 2 or 3 if the processor for index 1 is
busy: indexes 2 and 3 could also pass control to any other
local index. The query travels down the rightmost local link
on the controlling processor, locating the range (t-z) as the
range to hold v; in the preferred embodiment, index 2 is
immediately calculated as the specific index to hold v within
the range (t-z) by virtue of its being the center of range (t-z).
If index 2 does not already have control, the query then
traverses the rightmost global link to index 2 at range (t-z)
and index 2 accesses the range and thereby the value within
(v). This process produces at least three different paths to v
chosen dynamically at query time; if the query started on
local index 2, then it requires 2 accesses (1 at the root plus
1 at the local node (t-z)); if it started on either local index 1
or 3, then it requires 3 accesses (2 for the local index plus
1 at index 2).

In the parallel data-structures, each processor may contain
multiple elements at any position v within the structure. The
number of elements contained at position v is determined by
the Rule for Fullness and Ordering Scheme for the given
parallel data-structure. The simplest Rule and Scheme allow
zero (0) through two (2) elements per processor to be
contained at any position v. Such a simple Rule and Scheme
are assumed for the introduction and any other section of this
application unless otherwise stated.

In the Insert() function, for sequential data-structures, the
insertion of an element y [Insert(y)] results in the placement
of the element y in the sequential data-structure at some
position v. The position v is determined by the element y's
orderable or ordinal relationship to the other elements and
the positions of the other elements in the data-structure. The
position v is determined by the "rule for insert" for the given
sequential data-structure. Position v is also determined by
the rule for insert in the parallel data-structure: each
processing element creates a configuration for the data-structure
identical to the configurations at all other processing
elements.

Using the Rule for Fullness and Ordering Scheme
mentioned above for parallel data-structures, each processor may
contain as many as two elements y1 and y2 at position v.
Consequently with P identical data-structures, one at each
processor, there exist in total 1≤n≤(2P-1) elements y_ij
(1≤i≤P) (1≤j≤2) at all positions v, taken cumulatively.
Any processor i (1≤i≤P) may insert any element y into the
parallel data-structure, and the element y will be placed at
position v in one of the data-structures held by one of the
processors. Although this may result in different
configurations for the sequential and parallel versions of the
structures, the essential relationships between the data
elements in the data-structures will remain the same for both
versions of a given data-structure.

The Remove() function for sequential data-structures has
one of two forms. A Remove() according to position finds an
element in a given position in the data-structure and removes
the element. A Remove() according to value searches the
data-structure for a given value of y and removes the
element. In both cases, the data-structure may be re-ordered
to compensate for the absence of the removed element. In an
adapted parallel version of a given data-structure, any or all
of the processors may execute a Remove(y) function
appropriate to the sequential data-structure with the same result.

For parallel inserts and removes, a given processor simply
locates the position v in the data-structure either by position
or value. Multiple processors may then cooperate to search
for a desired data element y within position v. The
processors search through the 1≤n≤(2P-1) elements at the
positions v at all processors i (1≤i≤P) in parallel. If the removal
of an element y at position v leaves position v sufficiently
empty, then each processor re-orders its data-structure
according to the missing position v corresponding to the
element y in the same way that the sequential method would.
If the addition of an element y requires more nodes to
contain the larger data set, then each processor re-orders its
data-structure according to an additional position w
corresponding to the element y in the same way that the
sequential method would.

Preferred Embodiment—Uses of Data-Structures

The uses of the adapted parallel versions of the
data-structures and maintenance programs are the same as the
uses of their sequential counterparts, only in a parallel
environment. The speedup of the parallelization brought
about by the present method is very efficient and justifies its
design.

Preferred Embodiment

Definitions

Many terms must be defined to adequately describe the
Process of Adaptation.

At Will (Implies Blind) — an activity that a processor may
perform at any time regardless of the activities of other
processors;

Blind (Blindly) — Activities performed by a processor or
set of processors with no cooperation from other processors;

Cooperative (cooperatively) — activities performed by a
set of processors that require communication and/or
coordination between processors;

Data-structure — an organization or method of
organization for data. Preferably, the data-structures are based (form
or configuration and functioning) on orderable data; e.g.
heaps, B-trees, binary search trees, etc.;

Defined G-node: See G-node

Element — a single data-value within a data-structure.
Each element may be of any type (e.g. integer, real, char,
string, enumerated, pointer, record, or other). The elements
must all relate to each other in some orderable fashion;

Element Deletion — Removal of an element from a
G-node;

Element Addition — Insertion of an element to a G-node;

Explicit G-node Range — see G-node Range

Global — all of the processors;

G-node (Global Node) — a set of P (number of processors)
P-nodes. Each G-node contains 0≤n≤(xP) elements (x=Max
number of elements in each P-node). In the preferred
embodiment, each P-node in a G-node occupies the same
position in each per-processor data-structure. Each P-node
in a G-node contains the G-node Range of that G-node. The
G-node functions in the parallel method in the same way an
S-node functions in the corresponding sequential method.
The G-node uses the G-node Range to relate to the other
G-nodes. G-nodes are created simultaneously with the
P-nodes which are contained in the G-node.

G-nodes have the following properties: each has a G-node
Range; all the G-nodes in a parallel data-structure may
become full or empty or partially empty; when a G-node
becomes full, it is Split; when a G-node becomes sufficiently
empty, it is deleted. The determination of when a G-node is
full or sufficiently empty depends on the Rule for Fullness
for the G-node. Each G-node is composed of P sets of data
elements within the G-node Range; each of the P sets may
contain from 0 to x elements. A Defined G-node is a G-node
with a fully defined G-node Range. An Undefined G-node
has a G-node Range with one or more boundaries left open
or undefined.
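
The G-node definition can be pictured as a small record. The following Python sketch is illustrative only: the class name GNode, its field names, the inclusive range test, the least-loaded placement policy and the fullness test (full once all P sets hold x elements) are assumptions standing in for the Ordering Scheme and Rule for Fullness, which the text leaves to each data-structure.

```python
from dataclasses import dataclass, field

@dataclass
class GNode:
    lo: float            # lower boundary of the G-node Range
    hi: float            # upper boundary of the G-node Range
    p: int               # number of processors (one P-node each)
    x: int = 2           # assumed maximum elements per P-node
    pnodes: list = field(default_factory=list)

    def __post_init__(self):
        if not self.pnodes:
            self.pnodes = [[] for _ in range(self.p)]

    def in_range(self, value):
        return self.lo <= value <= self.hi

    def count(self):
        return sum(len(s) for s in self.pnodes)

    def is_full(self):
        # Assumed Rule for Fullness: every P-node holds x elements.
        return self.count() >= self.x * self.p

    def add(self, value):
        """Place value in the least-loaded P-node (assumed Ordering Scheme)."""
        if not self.in_range(value):
            raise ValueError("value outside G-node Range")
        target = min(range(self.p), key=lambda i: len(self.pnodes[i]))
        self.pnodes[target].append(value)
        return target
```

Each entry of pnodes stands for the set held on one processor; in a real deployment those sets would live on separate system components rather than in one list.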

G-node removal — deletion of a G-node from the
data-structure; this effectively removes old G-node Ranges from
the data-structure;

G-node insertion — Addition of a G-node to the
data-structure; this effectively adds new G-node Ranges to the
data-structure;

G-node Range — The G-node Range is the range of values
that the G-node may contain; in the preferred instance, a set
of two values R(G) = {R(G_min), R(G_max)} that are the
minimum and maximum values of the elements which may be
contained in the G-node G. The G-node Range determines
the proper placement of the G-node within the parallel
data-structure and thereby determines the proper placement
of an element or P-node within each per-processor
data-structure;

The G-node is stored across multiple processors, but the 
G-node Range uses the same range for each component of 
the G-node on each processor. The Range is stored with the 
G-node. The Range may be stored either explicitly or
implicitly: explicit storage of the G-node Range is the listing 
of the values that define the range; implicit storage would be 
the storage of one or more values from which a range could 
be calculated. 

G-node Split — A G-node Split occurs when a G-node 
becomes full. The Splitting process divides all of the values 
contained in the G-node into two roughly equal sets X and 
Y with distinct ranges. One set X remains in the G-node, the 
other set Y is stored in a newly created G-node. The G-node 
Ranges of the two nodes are set according to the division of 
the sets X and Y. The G-node Split is a method of adding 
new G-nodes to the set of G-nodes comprising the parallel
data-structure; by virtue of the G-node Range as the basis for 
this process it is also a range addition method. 
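
A G-node Split as defined above might look like the following Python sketch. It is an illustration under stated assumptions, not the patented procedure: the function name split_gnode, the dictionary layout, the round-robin dealing of values to P-nodes, and the choice to make the upper boundary of the first range exclusive (so that no value falls between the two new ranges) are all hypothetical. Values are assumed to be distinct.

```python
def split_gnode(lo, hi, pnodes):
    """Split a full G-node with Range [lo, hi] into two G-nodes X and Y."""
    p = len(pnodes)
    values = sorted(v for node in pnodes for v in node)
    if len(values) < 2:
        raise ValueError("need at least two values to split")

    mid = len(values) // 2
    x_vals, y_vals = values[:mid], values[mid:]   # two roughly equal sets
    boundary = y_vals[0]                          # smallest value kept in Y

    def deal(vals):
        # Spread the values evenly over the P per-processor sets.
        sets = [[] for _ in range(p)]
        for i, v in enumerate(vals):
            sets[i % p].append(v)
        return sets

    # X keeps [lo, boundary); Y, the newly created G-node, takes [boundary, hi].
    x_node = {"lo": lo, "hi": boundary, "hi_exclusive": True, "pnodes": deal(x_vals)}
    y_node = {"lo": boundary, "hi": hi, "hi_exclusive": False, "pnodes": deal(y_vals)}
    return x_node, y_node
```

Setting the new boundary from the data is one of the options the Range Determination Rules allow; a rule that fixes the boundary first and then moves data to fit would be equally consistent with the definition.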
Implicit G-node Range — see G-node Range 
Link — representation and reference to the relationship of 
adjacent nodes; 

MAXVAL — Maximum possible value (∞)
MINVAL — Minimum possible value (-∞)
Ordering Scheme — The manner in which data elements
are arranged within a G-node. May be ascending,
descending, partially or fully sorted, or completely unordered,
in addition to many other arrangements. Different Schemes
may be defined for different data-structures. Schemes may
be defined to provide efficient access paths, efficient data 
distribution, proper placement of an element into an appro- 
priate P-node within a G-node, or other provisions; 
Ordinable — data that has the capacity to be ordered; 
P — number of processors on a parallel machine or dis- 
tributed network; 

Parallel — processes or entities performed or existing on 
multiple processing or memory storage units, designed to 
perform or exist on multiple processors or system 
components, or having a structure that lends itself to similar 
distribution; 

Parallel Data-structure or Global Data-structure — the
data-structure that results from applying this process of
adaptation to a sequential data-structure. A parallel or Global
data-structure is composed of a set of P sequential
data-structures each of which is composed of a set of P-nodes and
incident links. The P-nodes and links form precisely the 
same configuration on each processor. 
Partially Defined G-node: See G-node

Partially Undefined G-node: See G-node

Per-processor — sequential activities on a processor or on
a set of processors: this term conceptually divides a parallel
entity or process into its sequential parts and refers to each
processor's activity separately;

P-node (Parallel Node): an adaptation of an S-node. A
P-node contains 0 to n elements which fall into its G-node
Range. In addition, a P-node relates to the other P-nodes in
the data-structure not only by the value of the elements
contained in the P-node but also by the P-node's G-node
Range which is contained in each P-node and determined by
the G-node to which the P-node belongs. Each P-node in a
data-structure is part of a G-node. When converting an
S-node into a P-node, extra links are not added for the extra
elements. The rules for relationships between P-nodes on a
processor are the same as the rules for relationships between
the S-nodes of the sequential data-structure from which the
parallel version was derived with respect to G-node Ranges.
Except for P-nodes created at the very beginning of the
process, P-nodes are generally created through the splitting
of G-nodes;

Processor — a processing element or CPU with or without
its own local memory in a parallel or distributed environment.
The processors are all interconnected in the parallel
machine or network by communication lines or by shared
memory. Also used to refer to any system component to
which work may be distributed;

Range Relation Function — This function R() determines
how G-node Ranges relate to each other (i.e. less than,
greater than, equal to, subsets of, supersets of each other,
etc.);

Range Determination Rules — these rules determine
ranges for the data: in the preferred instance, the range is
based on data placement (the number, value, distribution
and/or positions of element values for splits); however,
ranges may also be set to force a change in the data
placement, or set according to other criteria;
Rule for Fullness — The rule by which the fullness or
emptiness of a G-node is determined. Full G-nodes are split;
empty G-nodes are removed. Different Rules may be defined
for different data-structures. The goal in setting the rules for
determining the fullness of G-nodes is to make the most
efficient use of space and processing time. The Rule for
Fullness expresses and may be used to maintain: range or
G-node fullness, emptiness, range breadth (narrowness or
broadness), density and distribution of data values within
data structures, etc.;

Rule for Insert (Also referred to as Rule for Remove and
Rule for Positioning Nodes) — the ordinal or orderable
relationships between the data elements contained in the nodes
of a given data-structure; especially in the Insert() and
Remove() functions of sequential programs and
data-structures and the same functions (with respect to G-node
Ranges) for their parallel counterparts;

Sequential or Serial — processes or entities performed or
existing on a single processor or designed to perform or exist
on a single processor;

S-node (Sequential node) — a single cell within a sequential
data-structure that contains a single element. Each
S-node relates to its adjacent nodes or to the rest of the
data-structure according to the ordinable relationships
between the element the node contains and the elements
contained in the other nodes;

Preferred Embodiment 

Symbols

Data-structures — Sets of nodes containing elements, and 
incident links. 
Elements — An element is a single piece of ordinable data.
Elements are generally depicted in one of two ways: 1. An
element may be thought of as having a constant value; such
an element usually belongs to a set that contains members
with single subscripts: (W={w_1, w_2, w_3, . . . w_j}); 2. An
element may also be referenced by its position in a
data-structure; these are usually referenced by the subscripted cell
letters to which they belong: (X={x_111, x_112, x_121, x_122 . . . }).
Elements are also frequently depicted as their actual values,
both in set and graphical form.

G-node Ranges — A G-node Range is considered consistent
over the entire G-node, and therefore has the same kind of
notation at each P-node. Because the actual values of the
range may be explicit or implicit, the Range is indicated by
a function reference [R(G-node)]. The parameter G-node is
expressed in the manner appropriate to the given example.
If the function receives an element parameter, it may then be
used to compare the element to ranges to determine proper
placement of the element. The function R() may be called by
any processor and may use any other additional parameters
needed to calculate the Range. In the preferred instance, the
result is a minimum and maximum value allowable for the
P-node and/or G-node: R(A_o1o)={R(a_o11), R(a_o12)}=
{minimum, maximum}; the naught in the third position may
generally be replaced with a 1 or a 2 indicating the limits of
the Range; most G-node Ranges consist of two values.
Example (for integer type elements): G-node T_o5o has
G-node Range R(T_o5o)={R(t_o51), R(t_o52)}={75, 116}; this
means that G-node T_o5o may contain elements between 75
and 116 in value and still be consistent with the
data-structure rules. On diagrams, G-node Ranges are depicted
parenthesized under the P-nodes (G-nodes) to which they
belong.

G-nodes — a set of P-nodes related by their G-node
Ranges. If the processor number subscript and Element
number subscript of a set member are naught then the
representation is of a G-node. The set T as a set of G-nodes:
T={t_o1o, t_o2o, t_o3o}. The representation of a
G-node is rather unique, being distributed amongst processors
around the page. FIGS. 1 and 2 contain the same set of
data values on a serial B-tree and a parallel B-tree
respectively. FIG. 2 contains five (5) G-nodes: A_o1o, A_o2o, B_o1o,
C_o1o, D_o1o. Assuming the parallel B-tree in FIG. 2 contains
the set S of integer data elements, we depict the set S as
data-structure nodes, G-nodes, and P-nodes: S={{A, B, C,
D}, {(A_o1o, A_o2o), (B_o1o), (C_o1o), (D_o1o)},
{(A_11o, A_21o), (A_12o, A_22o), (B_11o, B_21o), (C_11o, C_21o),
(D_11o, D_21o)}}. G-node A_o2o comprising P-nodes A_12o and A_22o is
identified in FIG. 2.

P-nodes — assume a number (P>1) of processors: P-nodes
assume multiple elements; P-nodes have three (3) subscripts
(processor-number, node-number, element or cell number).
A P-node has multiple cells for elements; when the
element-number subscript is specified, the reference is to a specific
cell within the P-node; when the element-number subscript
is naught (o), the reference is to the entire P-node. Reference
to P-node cells: (on processor 1) T={t_111, t_112, t_121, t_122,
t_131, t_132}. Reference to P-nodes: T={t_11o, t_12o, t_13o}. For greater
convenience, P-nodes may be identified by a node letter and
non-subscripted processor number (e.g. A1, A2, B1, B2,
etc.). See G-nodes.

Sets — S, P, G-nodes and sets of elements. Sets are
designated by upper case letters; members of sets are generally
designated by lower case, subscripted letters;

S-nodes — nodes within a sequential data-structure.
S-nodes in a set generally have only one subscript: S={s_1, s_2,
s_3, . . . };

Subscripts — integers or variables between 1 and 9
inclusive, or a naught (o). Subscript numbers will not exceed
9 unless otherwise stated. Naught (o) — represents an
absence of specification of individual members of a set with
regard to the subscript position, and, thereby, will define a
sub-set. Variables with three (3) subscripts have the following
subscript order: S_ijk : S_(processor number, node number, cell
number).

Preferred Embodiment
Preferences for Adaptable Data-Structures

1). Ordinal data is preferred for the elements contained
and ordered within the sequential data-structure by the
sequential method and/or rules of ordering.
2). The sequential data-structure is preferred to be capable
of representation by nodes containing data elements and
links that relate the nodes according to the relationships of
the elements contained in the nodes. The relationships
represented by the links may relate the node to adjacent
nodes, non-adjacent nodes and/or to the rest of the
data-structure as a whole.
3). The adapted nodes are preferred to
have the capability of the calculation of G-node Ranges
which may be related to each other in an ordinal fashion.
4). Contiguity: the structure is preferred to have the quality that
the placement of nodes makes the data ranges contiguous
with respect to the structure and rules of the graph and the
data set contained and organized according to its ranges.

Preferred Embodiment
General Description

The purpose of this section is to describe the
data-structures and functions in a less technical manner than that
of the pseudo-code contained in other sections.
This section contains only a generalized description of the
present method and does not contain all the details of the
invention. For ease of understanding, the description in this
section is presented using graphical depictions of the
data-structures in their sequential (single-processor) forms along
with accompanying descriptions of the graphical figures; the
parallelized forms of the data-structures are then depicted in
the same fashion. It may serve as an introduction to the basic
concepts of the present method so that the reader will find
the other descriptions easier to follow.

The problem that the invention solves is presented in the
section Discussion of Problem. The example given in that
section functions in the same manner as the examples given
here, but the description in this section explains the
underlying functionality that produces the results shown in the
previous section. In addition, two Examples are shown here
to ensure that the general concept is understood to apply to
various types of data-structures. A complete description of
the present method, the basic parallel method, its
functionality and the data-structures that it produces are described in
other sections.

Preferred Embodiment
Description of the Shapes of Single-processor
Data-Structures and their Multi-processor Counter-parts

This section describes the configuration of adapted
parallel data-structures, how they are stored on multiple
processors or memory storage devices, and how they are similar
to their single-processor counter-parts. The values of the
data elements in a single-processor data-structure determine
its shape [see single B-tree FIG. 1]. The single-processor
B-tree in FIG. 1 contains 12 distinct values and has 12
distinct positions for those values. The numerical
relationships (greater-than/less-than) between the 12 elements in the
B-tree in FIG. 1 determine the shape of the tree and the
positions of the elements.

The values of the data elements in a multi-processor
data-structure also determine its shape; however, they
determine its shape according to the ranges of values into which
they fall [see parallel B-tree FIG. 2]. Although the same 12
elements populate the parallel B-tree in FIG. 2, the shape of
the parallel B-tree is not determined by the positions of 12
distinct elements, but by the positions of 5 distinct ranges
that the 12 elements fall into.

Comparing the trees in FIG. 1 and FIG. 2, we see that the
elements 20 and 26 occupy node C in FIG. 1. The contents
of node C are determined by the fact that the parent node A
of the tree contains the two values 15 and 30: therefore all
elements greater than 15 and less than 30 are placed in node
C. The adapted parallel version of the tree in FIG. 2 also has
a node C; however, the parallel node C has two parts, C1 (on
processor 1) and C2 (on processor 2). The contents of the
parallel node C are determined by the fact that the parallel
parent node A contains two ranges of values (15 to 20) and
(40 to 45): therefore all elements greater than (15 to 20) and
less than (40 to 45) are placed in parallel node C. Therefore
the elements in parallel node C fall into the range (21 to 39);
these elements are 26, 30 and 33.
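
The way node C's range follows from the two ranges in its parent can be sketched in a few lines of Python. This is only an illustration: it assumes integer elements and inclusive ranges, and the name `range_between` is hypothetical rather than part of the present method.

```python
def range_between(left, right):
    """Return the inclusive integer range lying strictly between two G-node Ranges.

    `left` and `right` are (minimum, maximum) pairs, with `left` entirely below `right`.
    """
    left_min, left_max = left
    right_min, right_max = right
    if left_max >= right_min:
        raise ValueError("ranges must be ordered and must not overlap")
    return (left_max + 1, right_min - 1)


# Parent node A in FIG. 2 holds the ranges (15 to 20) and (40 to 45).
child_c = range_between((15, 20), (40, 45))
print(child_c)  # (21, 39)

# The three elements the text places in parallel node C all fall in that range.
in_c = [v for v in (26, 30, 33) if child_c[0] <= v <= child_c[1]]
print(in_c)  # [26, 30, 33]
```

For non-integer element types the "+ 1" and "- 1" steps would be replaced by whatever the Range Determination Rules prescribe.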

The parallel B-tree in FIG. 2 is composed of two identically
shaped trees (one on each processor). The elements in
these identical trees are also positioned identically within the
tree according to the ranges that they fall into. This grouping
of elements according to identical ranges on each processor
creates a Global-node or G-node: the G-node is a collection
of data elements in identical positions within identical
data-structures contained on multiple processors or processing
components. Each G-node has its range (G-node Range)
recorded on each processor. The G-node with G-node Range
(40 to 45) is positioned as the second entry in node A in FIG.
2. If the value 43 were Inserted by either processor into the
parallel B-tree, then it would take position in this G-node
because it falls into the G-node Range (40 to 45). This G-node
would then contain the values 40, 43, and 45. The concept of
the G-node is central to the functioning of the parallelized
method/data-structure: once the concepts of the G-node, the
G-node Range and the G-node Split are firmly grasped, the
present method should be fairly easy to comprehend. The
G-node Split is explained in the following section.
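
A G-node can be pictured as one range plus one list of elements per processor. The Python sketch below is a hedged illustration, not the patent's pseudo-code: the class `GNode` and its method names are hypothetical. The text does not say which processor holds 40 and which holds 45, so the starting layout is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class GNode:
    """One G-node: a shared range plus a P-node (list of elements) per processor."""
    low: int
    high: int
    p_nodes: dict = field(default_factory=dict)  # processor number -> elements

    def accepts(self, value):
        # A value belongs to the G-node when it falls inside the G-node Range.
        return self.low <= value <= self.high

    def insert(self, processor, value):
        if not self.accepts(value):
            raise ValueError(f"{value} is outside range ({self.low} to {self.high})")
        self.p_nodes.setdefault(processor, []).append(value)

    def elements(self):
        # All elements of the G-node, gathered across processors.
        return sorted(v for cell in self.p_nodes.values() for v in cell)


# Assumed layout: processor 1 holds 40, processor 2 holds 45.
g = GNode(40, 45, {1: [40], 2: [45]})
g.insert(1, 43)          # 43 falls into (40 to 45), so it joins this G-node
print(g.elements())      # [40, 43, 45]
print(g.accepts(33))     # False
```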

Preferred Embodiment 
Verbal Description (Insert and Remove) 

This section gives a verbal description of how the preferred
embodiment functions. Adapted parallel data-structures
created by the preferred embodiment are always
composed of P identical data-structures, each contained on
one of P processors or system components. The adapted
parallel data-structures take form and are organized according
to the same principles (with respect to G-node Ranges)
that form and organize the single-processor data-structures
from which the parallel versions are derived.

As mentioned previously, the single-processor
data-structures to be adapted are created and maintained through
the use of Insert and Remove functions for the respective
data-structures. The ability to Insert and Remove from
ordered lists of data implies the ability to search. Search
(Find) functions are preferred aspects of the single-processor
Insert and Remove functions in general.

The multi-processor Insert, Remove and Find functions
may be originated at any time, on any processor (1 to P). The
processor originating the Insert, Remove or Find function
may or may not need to involve the other processors in the
effort. In some cases these functions can be executed by a
single processor within the parallel or distributed system.
Whether or not other processors need to be involved
depends on how much room there is in a G-node for Insert
and whether or not a specific value is present on a given
processor for Find or Remove.

For this general description, the parallel versions of the
Insert and Remove functions may be said to have three
phases, referred to below as Steps 1 through 3: (1) Location
of the proper G-node on the originating processor; (2) Location
of the proper processor (1 through P), with insertion or
removal of the element on that processor; (3) Performance of
G-node Split or G-node deletion if necessary. Step 1 can be
performed by any single processor, at any time, independent
of the other processors. Step 2 involves more than one
processor in a cooperative effort unless the "proper processor"
is the processor that originated the Insert or Remove. Step 3
usually requires all processors to communicate for a G-node
Split because the elements in the G-node must be sorted
across processors for a Split; Step 3 usually does not require
all processors to communicate for a G-node deletion.
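
The G-node Split in Step 3 can be sketched as follows, using the six values and the range (60 to max) from the first Split in Example 1. This is a minimal single-process sketch under stated assumptions: the processors are modelled as plain lists, the helper names `deal` and `split_g_node` are hypothetical, and the boundary rule (the lower range ends at its largest value and the upper range starts one above it) is inferred from the ranges (60 to 70) and (71 to max) in that Example, not taken from the Range Determination Rules themselves.

```python
import math

MAXVAL = math.inf  # stands in for the document's MAXVAL


def deal(values, p):
    """Spread sorted values over p processors in ascending, contiguous chunks."""
    base, extra = divmod(len(values), p)
    cells, start = [], 0
    for i in range(p):
        size = base + (1 if i < extra else 0)
        cells.append(values[start:start + size])
        start += size
    return cells


def split_g_node(g_range, p_nodes):
    """Split a full G-node into a lower and an upper G-node.

    Returns ((lower_range, lower_p_nodes), (upper_range, upper_p_nodes)).
    """
    low, high = g_range
    # The elements must be sorted across processors before the Split.
    values = sorted(v for cell in p_nodes for v in cell)
    mid = len(values) // 2
    lower, upper = values[:mid], values[mid:]
    lower_range = (low, lower[-1])          # assumed rule, see lead-in
    upper_range = (lower[-1] + 1, high)
    p = len(p_nodes)
    return (lower_range, deal(lower, p)), (upper_range, deal(upper, p))


# Full G-node with range (60 to max) spread over three processors.
new_g, old_f = split_g_node((60, MAXVAL), [[60, 65], [70, 74], [75, 78]])
print(new_g)  # ((60, 70), [[60], [65], [70]])
print(old_f)  # ((71, inf), [[74], [75], [78]])
```

In a real distributed setting the sort and the redistribution would be message exchanges between processors, which is why the text notes that a Split usually requires all processors to communicate.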

The following steps 1 through 3 are also identified in the 
pseudo-code for the parallel Insert and Remove functions 
given in the Program Adaptation section. 
Step 1 

(Location of the proper G-node on the originating pro- 
cessor (Find G-node)) 

The functioning of the parallelized method depends on the
functioning of the single-processor method. The
single-processor method functions according to the relationships
between the values of the elements: the multi-processor
method functions according to the relationships of the
ranges of values of the elements.

The search of an ordered list is performed by comparing
the values found at positions within the data-structure. For
Example: Searching the single-processor B-tree for 33 in
FIG. 1, we start at the top node and compare the values. 33
falls between 30 and 45, so we travel down the link under
45 and find node D. Searching node D from left to right we
immediately locate 33. Searching the multi-processor
B-tree for 33 in FIG. 2, we start at the top node and compare
the ranges. 33 falls between the ranges (15 to 20) and (40 to
45), so we travel down the link under (40 to 45) and find
node C. Parallel node C may be located by either processor
1 or processor 2. Searching parallel node C for the value 33
is described in Step 2.
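
Step 1 amounts to an ordinary tree descent in which each comparison is made against a range instead of a single value. The sketch below is illustrative only: the dictionary layout and the name `find_g_node` are hypothetical. The ranges of node A and node C come from the text, while the ranges given to leaves B and D are assumptions, since the text does not state them.

```python
import math

MINVAL, MAXVAL = -math.inf, math.inf


def find_g_node(node, value):
    """Descend from `node`, comparing `value` with G-node Ranges.

    Returns (node name, range) for the G-node the value falls into, or
    (node name, None) if the search ends at a leaf with no matching range.
    """
    while True:
        next_node = None
        for i, (low, high) in enumerate(node["ranges"]):
            if low <= value <= high:
                return node["name"], (low, high)
            if value < low:
                # Value lies to the left of this range: follow the link before it.
                next_node = node["children"][i] if node["children"] else None
                break
        else:
            # Value is greater than every range here: follow the right-most link.
            next_node = node["children"][-1] if node["children"] else None
        if next_node is None:
            return node["name"], None
        node = next_node


# One processor's copy of the parallel B-tree in FIG. 2 (B and D ranges assumed).
tree = {
    "name": "A",
    "ranges": [(15, 20), (40, 45)],
    "children": [
        {"name": "B", "ranges": [(MINVAL, 14)], "children": []},
        {"name": "C", "ranges": [(21, 39)], "children": []},
        {"name": "D", "ranges": [(46, MAXVAL)], "children": []},
    ],
}

print(find_g_node(tree, 33))  # ('C', (21, 39))
print(find_g_node(tree, 43))  # ('A', (40, 45))
```

Because every processor holds an identically shaped tree with the same ranges, this descent gives the same answer on any processor without communication.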
Step 2 

(Location of the proper processor (1 through P))

Once a given processor p has successfully located the proper
G-node within its data-structure, it may then send the
location of this G-node to the other processors in the system.
Each of these processors may then attempt to locate the
search value within its own portion of the G-node or attempt
to place a value in the proper G-node.

In Step 1 above, we located G-node C (FIG. 2) as the
proper node for 33. If the originating processor is processor
1, it sends a request to processor 2 to search G-node C;
processor 2 then searches its portion of the G-node and finds the
value; it may then Insert or Remove the value from the
data-structure. If the originating processor is processor 2, it
immediately locates the value 33 and need not make any
request of processor 1.
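
The two cases in this paragraph can be traced with a small Python sketch. It is an assumption-laden illustration: the function name `locate` is hypothetical, requests are simulated as a loop in one process, and the text only states that processor 2 holds 33, so placing 26 and 30 on processor 1 is assumed.

```python
def locate(p_nodes, origin, value):
    """Find which processor's portion of a G-node holds `value`.

    Returns (processor or None, number of requests sent to other processors).
    """
    if value in p_nodes[origin]:
        return origin, 0          # found locally, no request needed
    requests = 0
    for proc in sorted(p_nodes):
        if proc == origin:
            continue
        requests += 1             # one simulated request per other processor asked
        if value in p_nodes[proc]:
            return proc, requests
    return None, requests


# G-node C of FIG. 2; the placement of 26 and 30 on processor 1 is an assumption.
g_node_c = {1: [26, 30], 2: [33]}

print(locate(g_node_c, 1, 33))  # (2, 1)
print(locate(g_node_c, 2, 33))  # (2, 0)
```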

Whether or not the originating processor needs to send
requests to other processors for location of values is
dependent on the ordering of values within the G-node. The
data-structures in FIG. 2 have G-nodes with unordered
internal values. For a discussion on ordering values within
G-nodes, see other sections.
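
One possible Ordering Scheme, the one the Examples later in this description use, keeps the elements of a G-node in ascending order across processors and evenly distributed. The sketch below computes the exchanges needed to restore that order. It is a hedged illustration: `redistribute` is a hypothetical name, the values are assumed to be distinct, and the input layout is the one implied by Example 1 after Insert(80), Insert(89) and Insert(85).

```python
def redistribute(p_nodes):
    """Re-order a G-node so values ascend across processors, evenly distributed.

    Returns (new layout, list of (value, from_processor, to_processor) moves).
    Assumes all values in the G-node are distinct.
    """
    procs = sorted(p_nodes)
    owner = {v: proc for proc in procs for v in p_nodes[proc]}
    values = sorted(owner)
    base, extra = divmod(len(values), len(procs))
    layout, moves, start = {}, [], 0
    for i, proc in enumerate(procs):
        size = base + (1 if i < extra else 0)
        layout[proc] = values[start:start + size]
        start += size
        # Any value whose current owner differs from its target must be sent.
        moves += [(v, owner[v], proc) for v in layout[proc] if owner[v] != proc]
    return layout, moves


layout, moves = redistribute({1: [74, 80], 2: [75, 89], 3: [78, 85]})
print(layout)  # {1: [74, 75], 2: [78, 80], 3: [85, 89]}
print(moves)   # [(75, 2, 1), (78, 3, 2), (80, 1, 2), (89, 2, 3)]
```

With such an ordering in place, a processor can often tell from the values it holds whether a searched value could be local, which is why the need for requests depends on the Ordering Scheme.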
Step 3 

(Performance of G-node Split or G-node deletion if 
necessary) A G-node Split is the creation of a new G-node; 
a G-node deletion is the destruction of an existing G-node.
When a G-node is considered full, it is Split; when it is
considered empty (or sufficiently empty), it is destroyed and
deleted from the data-structure. The fullness or emptiness of
a G-node is similar in conception to the fullness or emptiness
of a node in a single-processor B-tree or m-way search tree.
The Rule for Fullness in this section is set forth in detail
below.

The G-node Split is described in the definition section
above; this definition is sufficient for the General Description.
For a more detailed description, see the Function
Explanation sections. The G-node deletion is simply the
removal of the G-node from the data-structure.

Once a G-node is created or destroyed, it must be added
or deleted from the given data-structure according to the
Rules for the data-structure with respect to the G-node
Ranges. Examining the data-structure on processor 1 in FIG.
2, we see that it is a valid B-tree, in its own right, regardless
of the existence of other processors: if we remove the value
5 from this serial B-tree, we produce the B-tree on processor
1 in FIG. 3. When both processor 1 and 2 perform this
removal simultaneously, each processor redistributes the
B-tree according to the rules of B-tree configuration: the
result is a G-node deletion. Note that this would require the
absence of all three values in the G-node: 4, 5, and 12. The
point being made here is that G-node additions and deletions
function according to the same rules as the single-processor
data-structures. This process is clarified further in the two
examples in the following section.

Preferred Embodiment
Descriptions by Example

A single-processor method creates its data-structure (such
as a B-tree) by Inserting and Removing the values contained
in the list to be maintained according to the Rules of the
Insert and Remove functions for the data-structure. The
Method of Adapting single-processor methods and their
associated data-structures into multi-processor methods and
data-structures makes use of the single-processor method.
Each of the following two Examples will create precisely the
same configurations for their data-structures in both the
single and multi-processor versions described. A reader
understanding the functioning of AVL trees and B-trees
should be able to see the functioning of the multi-processor
method as a transformation of the single-processor method
in each case. The present method transforms the
single-processor method into the multi-processor method.

Any implementation of a parallelized data-structure may
utilize variations on the Rules for Fullness of G-nodes and
the Ordering Scheme of elements within G-nodes. The
following rules and ordering scheme will be used for these
two Examples:

1. Rules for fullness/emptiness: each G-node in these
Examples will be composed of 3 sets of elements; each
set will contain zero through two elements. A G-node
is full when all three sets contain two elements (the
G-node therefore containing six elements). The G-node
is empty when all three sets contain zero elements.
2. Ordering Scheme: each P-node set within a G-node is
contained on a single processor (P1, P2 or P3). The
elements in the G-node will be kept in ascending order
across the processors and evenly distributed (P1
containing the smallest values, P2 the mid-most, P3 the
largest).

The following two Examples explain the underlying
functioning of the preferred embodiment by maintaining the
same list of values on two parallelized data-structures. Each
Example first describes the functioning of the
single-processor version of the data-structure on a similar list. Each
Example then describes the parallel version of the
data-structure. These Examples both use three (P=3) processors
for the parallel data-structures.

The single-processor versions of the lists are roughly
one third the size of the parallel versions. Each single value
inserted or deleted in the single-processor data-structure is
matched by values for the parallel version. The
multiple values inserted and deleted in the parallel version
are specifically chosen to fall into the proper G-node Ranges
so that the single and multi-processor data-structures take on
the same configurations: this is done so that the identical
functioning, form and structure of the single and
multi-processor versions can be easily seen. The functioning of the
parallel versions is in no way dependent on any choice of
element values (any list of ordinable data elements may be
Inserted or Removed in any order).

EXAMPLE 1
Single-processor Method

The single-processor AVL tree method is composed of
finding the proper location for a new node, adding that node,
and performing rotation.

Example 1 begins with FIG. 4, showing a properly
ordered single-processor AVL tree containing the elements
from the single-processor initial list.

1.) Insert(60)

Comparing values at each node: Root-node A: [60>40],
travel down the right-most link to node C; node C: [60>50],
travel right to node F; node F: [60<70]— F has no left link,
so we create a new node G and place it to the left of node
F (FIG. 5).

Node G has been added in its proper place: the AVL tree
is left unbalanced, therefore we perform RL rotation (FIG.
6).

2.) Insert(80)

Root-node A: [80>40], travel right to node G; node G:
[80>60], travel right to node F; node F: [80>70]— F has no
right link, so we create a new node H and place it to the right
of node F. The tree is still balanced (no rotation) (FIG. 7).

3.) Insert(90)

Root-node A: [90>40], travel right to node G; node G:
[90>60], travel right to node F; node F: [90>70], travel right
to node H; node H: [90>80]— H has no right link, so we
create a new node I and place it to the right of H (FIG. 8).

Node I has been added in its proper place: the AVL tree
is left unbalanced, therefore we perform RR rotation (FIG.
9).

4.) Remove(60)

Root-node A: [60>40], travel right to node G; node G:
[60=60], therefore delete node G; replace node G with the
left-most child of the right sub-tree (node F). The tree is still
balanced (no rotation) (FIG. 10).

Multi-processor Method

The multi-processor AVL tree method is composed of
finding the proper location for a new value, inserting the
values until a G-node Split thereby creating a new G-node,
adding that G-node, and then performing rotation in parallel.
Refer to Steps 1 through 3 in the Verbal Description
section.

Example 1 (multi-processor) begins with FIG. 11, showing
an Adapted AVL tree composed of 3 properly ordered
AVL trees on 3 processors containing the elements from the
multi-processor initial list. The G-node Ranges are shown
beneath the parallel nodes.

1.) Insert(60) [after which Insert(65)]

Insert(60) at processor P1
Step 1

Comparing values at each G-node:
Root-node A1: [(60)>(40-49)], travel down the right-most
link to node C1; node C1: [(60)>(50-59)], travel right
to node F1; node F1: [(60)=(60-max)]— 60 falls within
G-node Range (60-max), so we add 60 to this G-node at
processor P1.
Step 2

The values are properly ordered within the G-node, so
Step 2 is not necessary.
Step 3

The G-node is NOT full, so Step 3 is not necessary (no
G-node Split) (FIG. 12).

Insert(65) at processor P1
Step 1

Root-node A1: [(65)>(40-49)], travel right to node C1;
node C1: [(65)>(50-59)], travel right to node F1; node F1:
[(65)=(60-max)]— 65 falls within G-node Range (60-max),
so we add 65 to this G-node at processor P1 (FIG. 13).
Step 2

As FIG. 13 shows, the G-node F1 at processor P1 has 3
values. This is greater than the maximum number of values per
processor per G-node, so we perform Step 2 to properly
order the values within the G-node in F. Processor P1 sends
the value 70 to P2, and P2 sends 75 to P3. This exchange of
elements maintains the Ordering Scheme within G-nodes,
locating the proper processors for each value (rule 2 listed
above) (FIG. 14).
Step 3

After addition of 65 to this G-node, the G-node is full,
therefore we perform a G-node Split. The G-node Split
divides the values 60, 65, 70, 74, 75, 78 into two ranges. The
lower range is placed into a newly created G-node (in G), the
upper range is kept in the G-node in F (FIG. 15). The new
G-node (G) is placed to the left of the G-node
in F because its G-node Range (60-70) is less than the
G-node Range of F (71-max) (FIG. 16).

The addition of the new node G leaves the AVL tree
unbalanced as it did in the single-processor example,
therefore we perform RL rotation in parallel (FIG. 17).

2.) Insert(80), Insert(89), Insert(85)

These three Insertions are performed simultaneously.
Because the Insertions at processors P1, P2, and P3 all
follow the same procedure, we will follow Step 1 only at
processor P2.

Insert(80) at P1, Insert(89) at P2, Insert(85) at P3
Step 1 at Processor P2

Root-node A2: [(89)>(40-49)], travel right to node G2;
node G2: [(89)>(60-70)], travel right to node F2; node F2:
[(89)=(71-max)]— 89 falls within G-node Range (71-max),
so we add 89 to this G-node at processor P2. Following
identical comparisons at processors P1 and P3, the values 80
and 85 have been added to the same G-node (FIG. 18).
Step 2

As FIG. 18 shows, the values are not arranged in ascending
order within the G-node in F, so we perform Step 2.
Processor P1 sends the value 80 to P2, and P2 sends 75 to
P1 and also 89 to P3, P3 sends 78 to P2. This exchange of
elements maintains the Ordering Scheme within G-nodes,
locating the proper processors for each value (rule 2 listed
above) (FIG. 19).
Step 3

After addition of 80, 89 and 85 to this G-node, the G-node
is full, therefore we perform a G-node Split. The G-node
Split divides the values 74, 75, 78, 80, 85, 89 into two ranges.
The lower range is placed into a newly created G-node
which is in the AVL tree node (F), the upper range is kept in
the G-node in the newly formed AVL tree node (H) (FIG.
20).

The AVL tree node (H) containing the old G-node is
placed to right of node F because its G-node Range (79-max)
is greater than the G-node Range of F (71-78) (FIG. 21). The
addition of the new node H leaves the AVL tree balanced (no
rotation).

3.) Insert(98), Insert(95), Insert(90)

These three Insertions are performed simultaneously.
Because the Insertions at processors P1, P2, and P3 all
follow the same procedure, we will follow Step 1 only at
processor P3.

Insert(98) at P1, Insert(95) at P2, Insert(90) at P3
Step 1 at Processor P3

Root-node A3: [(90)>(40-49)], travel right to node G3;
node G3: [(90)>(60-70)], travel right to node F3; node F3:
[(90)>(71-78)], travel right to node H3; node H3: [(90)=
(79-max)]— 90 falls within G-node Range (79-max), so we
add 90 to this G-node at processor P3. Following identical
comparisons at processors P1 and P2, the values 98 and 95
have been added to the same G-node (FIG. 22).
Step 2

As FIG. 22 shows, the values are not arranged in ascending
order within the G-node in H, so we perform Step 2.
Processor P1 sends the value 98 to P3, and P2 sends 85 to
P1 and also 95 to P3, P3 sends 89 and 90 to P2. This
exchange of elements maintains the Ordering Scheme within
G-nodes, locating the proper processors for each value (rule
2 listed above) (FIG. 23).
Step 3

After addition of 98, 95 and 90 to this G-node, the G-node
is full, therefore we perform a G-node Split. The G-node
Split divides the values 80, 85, 89, 90, 95, 98 into two ranges.
The upper range is placed into a newly created G-node (I),
the lower range is kept in the G-node in H (FIG. 24).

The new G-node (I) is placed to right of G-node H
because its G-node Range (90-max) is greater than the
G-node Range of H (79-89) (FIG. 25). The addition of the
new G-node leaves the AVL tree unbalanced as it did in the
single-processor example, therefore we perform RR rotation
in parallel (FIG. 26).

4.) Remove(55), Remove(65), Remove(70), [after which
Remove(60)]

Remove(55) at P1, Remove(65) at P2, Remove(70) at P3
(performed simultaneously)

Remove(55) at P1
Step 1

Root-node A1: [(55)>(40-49)], travel right to node G1;
node G1: [(55)<(60-70)], travel left to node C1; node C1:
[(55)=(50-59)]— 55 falls within the G-node Range (50-59),
so processor P1 looks in node C1 for the value 55. 55 is not
in node C1 at processor P1, so we must perform Step 2.
Step 2

P1 sends a request to the other processors to look for 55 in
their respective nodes C2 and C3. Processor P2 finds 55 in
its node C2. P2 removes the value 55 from node C2 and sends
it to P1 (FIG. 27).
Step 3

G-node C is not empty and so Step 3 is not necessary.

Remove(65) at P2

Root-node A2: [(65)>(40-49)], travel right to node G2;
node G2: [(65)=(60-70)]— 65 falls within the G-node
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Range (60-70), so processor P2 looks in node G2 for 
the value 65 and finds 65 in G2. 
P2 then removes 65 from node G2. 
Step 2 

The value that P2 searched for (65) was found at P2, 
therefore Step 2 is not necessary (FIG. 27). 
Step 3 

G-node G is not empty and so Step 3 is not necessary. 
Remove(70) at P3 

Root-node A3: [(70)>(40-49)], travel right to node G3;
node G3: [(70)=(60-70)]: 70 falls within the G-node
Range (60-70), so processor P3 looks in node G3 for
the value 70 and finds 70 in G3.

P3 then removes 70 from node G3. 
Step 2 

The value that P3 searched for (70) was found at P3, 
therefore Step 2 is not necessary (FIG. 27). 
Step 3 

The G-node in G is not empty and so Step 3 is not 
necessary. 

Remove(60) at P2

Root-node A2: [(60)>(40-49)], travel right to node G2;
node G2: [(60)=(60-70)]: 60 falls within the G-node
Range (60-70), so processor P2 looks in node G2 for
the value 60. 60 is not in node G2 at processor P2, so
we must perform Step 2.
Step 2

P2 sends a request to the other processors to look for 60
in their respective nodes G1 and G3. Processor P1 finds 60
in its node G1. P1 removes the value 60 from node G1 and
sends it to P2 (FIG. 28).
Step 3

The G-node in G is empty and so we perform Step 3. The
removal of the G-node G is simply a matter of each of the
processors P1, P2 and P3 performing a normal AVL node
removal. P1 removes G1 from its tree; P2 removes G2 from
its tree; P3 removes G3 from its tree. Each of the processors
re-orders the tree according to the single-processor AVL tree
method and replaces node G with the left-most child of the
right sub-tree and performs range adjustment (node F) (FIG.
29).

EXAMPLE 2 
Single -processor Method 

The single-processor B-tree method is composed of find- 
ing the proper location for a new value, adding that value, 
and performing B-tree splits when the B-tree nodes are full 
(contain 3 values). 

Example 2 begins with FIG. 30, showing a properly 
ordered single-processor B-tree (degree 3) containing the 
elements from the single-processor initial list. 
1.) Insert(60) 

Comparing values at each node, moving through the 
tuples from left to right: 

Root-node A: [60>20], move right; [60>40], travel down
the right-most link to node D; node D: insert 60
between 50 and 70 at node D (FIG. 31).

D now has 3 values and must be split. The right-most
value goes in the new node (E). The left-most value is kept
in node D; the middle value (60) becomes the parent value
of D and is re-inserted at the parent node A (FIG. 32).

The parent node A (root-node) now has 3 values and must
be split. The right-most value goes in the new node (F). The 
left most value is kept in node A; the middle value (40) 
becomes the parent value of A and is re-inserted at the parent 
(no parent exists for the root, so a new root is created — node 
G) (FIG. 33). 
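
The two splits just described follow a single rule, which the following Python sketch illustrates. It is only a minimal illustration, assuming a full node is represented as a sorted list of exactly three values; the function name `split_full_node` is hypothetical and does not appear in the description.

```python
def split_full_node(values):
    """Split a full degree-3 B-tree node (hypothetical helper).

    `values` must hold exactly three values in ascending order.
    Returns (left, middle, right): the left-most value stays in the
    existing node, the middle value moves up to the parent, and the
    right-most value goes into a newly created node.
    """
    if len(values) != 3 or list(values) != sorted(values):
        raise ValueError("expected exactly three values in ascending order")
    left, middle, right = values
    return [left], middle, [right]


# Node D after Insert(60): 60 moves up to A, 70 goes to the new node E.
print(split_full_node([50, 60, 70]))
# Root A after receiving 60: 40 moves up to the new root G, 60 goes to F.
print(split_full_node([20, 40, 60]))
```

The sketch covers only the value movement of a split; relinking child pointers is left out.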
2.) Insert(80)

Root-node G: [80>40] travel right to node F; node F: 
[80>60] travel right to node E; node E: insert 80 after 70 at 
node E (FIG. 34). 
3.) Insert(90)

Root-node G: [90>40], travel right to node F; node F: 
[90>60], travel right to node E; node E: insert 90 after 80 at 
node E (FIG. 35). Node E now has 3 values and must be 
split. The right most value goes in the new node (H). The left 
most value is kept in node E; the middle value (80) becomes 
the parent value of E and is re-inserted at the parent node F 
(FIG. 36). 
4.) Remove(60) 

Root-node G: [60>40], travel right to node F; node F: 60
is found at node F and removed. This leaves F with too few
values, so it removes node E, places its value (70) in node
D, and makes 80 the parent value of node D (FIG. 37).
Multi-processor Method

The multi-processor B-tree method is composed of finding
the proper location for a new value, inserting values
until a G-node Split creates a new G-node, adding
that G-node, and performing B-tree splits when the B-tree
nodes are full (contain 3 G-nodes). (The G-nodes constitute
the elements of the B-tree.)

Refer to Steps 1 through 3 in the Verbal Description Section.
Example 2 (multi-processor) begins with FIG. 38, showing
an Adapted B-tree composed of 3 properly ordered
B-trees on 3 processors containing the elements from the
multi-processor initial list. The G-node Ranges are shown
beneath the parallel G-nodes.
1.) Insert(60) [after which Insert(65)]
Insert(60) at processor P1
Step 1

Comparing values at each node, moving through the
tuples from left to right:
Root-node A1: [(60)>(20-29)], move right; [(60)>
(40-49)], travel down the right-most link to node D1;
node D1: insert 60 into right-most G-node in node D.
Step 2

The values are properly ordered within the G-node, so
Step 2 is not necessary.
Step 3

The G-node is NOT full, so Step 3 is not necessary (no
G-node Split). (Note that although node D1 has three values,
it contains only 2 G-nodes and therefore does not need a
B-tree split.) (FIG. 39)

Insert(65) at processor P1
Step 1

Root-node A1: [(65)>(20-29)], move right; [(65)>
(40-49)], travel down right-most link to node D1; node D1:
insert 65 into second G-node in node D (FIG. 40).
Step 2 

As FIG. 40 shows, the second G-node in D1 at processor
P1 has 3 values. This is greater than the maximum number of
values per processor per G-node, so we perform Step 2 to
properly order the values within the G-node. Processor P1
sends the value 70 to P2, and P2 sends 75 to P3. This
exchange of elements maintains the Ordering Scheme within
G-nodes, locating the proper processors for each value (rule
2 listed above) (FIG. 41).
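
The Ordering Scheme itself is defined earlier in the description (rule 2). Judging only from the exchanges in this example, it keeps the values of a G-node in ascending order across processors, with equal shares per processor. The Python sketch below works under that assumption; the name `reorder_g_node` and the list-of-lists representation are hypothetical, and a global sort stands in for the pairwise messages the processors actually exchange.

```python
def reorder_g_node(p_nodes):
    """Redistribute a G-node's values so they ascend across processors.

    p_nodes[i] is the list of values held by processor i+1
    (hypothetical representation). Every processor ends up with a
    contiguous, equally sized slice of the sorted values; the last
    slices may be shorter if the count does not divide evenly.
    """
    values = sorted(v for node in p_nodes for v in node)
    count = len(p_nodes)
    share = -(-len(values) // count)  # ceiling division
    return [values[i * share:(i + 1) * share] for i in range(count)]


# State implied by the exchange: P1 holds 60, 65, 70; P2 holds 74, 75; P3 holds 78.
print(reorder_g_node([[60, 65, 70], [74, 75], [78]]))
# [[60, 65], [70, 74], [75, 78]]
```

The result matches the exchange in the text: 70 ends up at P2 and 75 at P3.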

Step 3

After addition of 65 to this G-node, the G-node is full,
therefore we perform a G-node Split. The G-node Split
divides the values 60,65,70,74,75,78 into two ranges. The
lower range is placed into a newly created G-node, the upper
range is kept in the existing G-node (FIG. 15).
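
A G-node Split of this kind can be sketched in Python as follows. The sketch assumes that the G-node's Range before the split was (60-max) and that the boundary between the two new Ranges falls immediately after the largest value of the lower half; both assumptions are inferred from the Ranges that result in this example, and the names `split_g_node` and `MAX` are hypothetical.

```python
MAX = float("inf")  # stands in for the open upper bound written "max"


def split_g_node(values, g_range):
    """Split a full G-node into a lower and an upper half (hypothetical helper).

    values  : all values in the G-node, across all processors
    g_range : (low, high) G-node Range before the split
    Returns ((lower_values, lower_range), (upper_values, upper_range)).
    """
    if len(values) < 2:
        raise ValueError("need at least two values to split")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[half:]
    low, high = g_range
    # The lower Range ends at its largest value; the upper Range starts
    # immediately after it, so the two Ranges never overlap.
    return (lower, (low, lower[-1])), (upper, (lower[-1] + 1, high))


lower, upper = split_g_node([60, 65, 70, 74, 75, 78], (60, MAX))
print(lower)  # ([60, 65, 70], (60, 70))
print(upper)  # ([74, 75, 78], (71, inf))
```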

The new G-node is placed to left of the existing G-node
because its G-node Range (60-70) is less than the G-node
Range (71-max) (FIG. 42). D now contains 3 G-nodes and
must be split (B-tree split). The right-most G-node goes in
the new B-tree node (E). The left-most G-node is kept in
B-tree node D; the middle G-node with G-node Range
(60-70) becomes the parent value of D and is re-inserted at
the parent node A (FIG. 43). The parent node A (B-tree
root-node) now has 3 G-nodes and must be split. The
right-most G-node goes in the new B-tree node (F). The left-most
G-node is kept in node A; the middle G-node (40-49)
becomes the parent value of A and is re-inserted at the parent
(no parent exists for the root, so a new root is created:
B-tree node G) (FIG. 44).
2.) Insert(80), Insert(89), Insert(85)

These three Insertions are performed simultaneously.
Because the Insertions at processors P1, P2, and P3 all
follow the same procedure, we will follow Step 1 only at
processor P2.

Insert(80) at P1, Insert(89) at P2, Insert(85) at P3
Step 1 at processor P2:

Root-node G2: [(89)>(40-49)], travel right to node F2;
node F2: [(89)>(60-70)], travel right to node E2; node E2:
[(89)=(71-max)]: 89 falls within G-node Range (71-max),
so we add 89 to this G-node at processor P2. Following
identical comparisons at processors P1 and P3, the values 80
and 85 have been added to the same G-node (FIG. 45).
Step 2

As FIG. 45 shows, the values are not arranged in ascending
order within the G-node at E, so we perform Step 2.
Processor P1 sends the value 80 to P2, and P2 sends 75 to
P1 and also 89 to P3, P3 sends 78 to P2. This exchange of
elements maintains the Ordering Scheme within G-nodes,
locating the proper processors for each value (rule 2 listed
above) (FIG. 46).
Step 3

After addition of 80, 89 and 85 to this G-node, the G-node
is full, therefore we perform a G-node Split. The G-node
Split divides the values 74,75,78,80,85,89 into two ranges.
The lower range is placed into a newly created G-node, the
upper range is kept in the existing G-node (FIG. 20).

The new G-node is placed to left of the existing G-node
because its G-node Range (71-78) is less than the G-node
Range (79-max) (FIG. 47).
3.) Insert(98), Insert(95), Insert(90)

These three Insertions are performed simultaneously.
Because the Insertions at processors P1, P2, and P3 all
follow the same procedure, we will follow Step 1 only at
processor P3.

Insert(98) at P1, Insert(95) at P2, Insert(90) at P3
Step 1 at processor P3:

Root-node G3: [(90)>(40-49)], travel right to node F3;
node F3: [(90)>(60-70)], travel right to node E3; node
E3: [(90)>(71-78)], move right; [(90)=(79-max)]: 90
falls within G-node Range (79-max), so we add 90 to
this G-node at processor P3. Following identical
comparisons at processors P1 and P2, the values 98 and 95
have been added to the same G-node (FIG. 48).
Step 2

As FIG. 48 shows, the values are not arranged in ascending
order within the G-node at E, so we perform Step 2.
Processor P1 sends the value 98 to P3, and P2 sends 85 to
P1 and also 95 to P3, P3 sends 89 and 90 to P2. This
exchange of elements maintains the Ordering Scheme within
G-nodes, locating the proper processors for each value (rule
2 listed above) (FIG. 49).
Step 3

After addition of 98, 95 and 90 to this G-node, the G-node
is full, therefore we perform a G-node Split. The G-node
Split divides the values 80,85,89,90,95,98 into two ranges.
The lower range is placed into a newly created G-node, the
upper range is kept in the existing G-node (FIG. 24).

The new G-node is placed to left of the existing G-node
because its G-node Range (79-89) is less than the G-node
Range (90-max) (FIG. 50). Node E now has 3 G-nodes and
must be split (B-tree split). The right-most G-node goes in
the new node (H). The left-most G-node is kept in node E;
the middle G-node (79-89) becomes the parent G-node of E
and is re-inserted at the parent node F (FIG. 51).
4.) Remove(55), Remove(65), Remove(70), [after which
Remove(60)]

Remove(55) at P1, Remove(65) at P2, Remove(70) at P3
(performed simultaneously)

Remove(55) at P1
Step 1

Root-node G1: [(55)>(40-49)], travel right to node F1;
node F1: [(55)<(60-70)], travel left to node D1; node D1:
[(55)=(50-59)]: 55 falls within the G-node Range (50-59),
so processor P1 looks in node D1 for the value 55. 55 is not
in node D1 at processor P1, so we must perform Step 2.
Step 2

P1 sends a request to the other processors to look for 55 in
their respective nodes D2 and D3. Processor P2 finds 55 in
its node D2. P2 removes the value 55 from node D2 and
sends it to P1 (FIG. 52).
Step 3

The G-node in D is not empty and so Step 3 is not
necessary.

Remove(65) at P2

Root-node G2: [(65)>(40-49)], travel right to node F2;
node F2: [(65)=(60-70)]: 65 falls within the G-node
Range (60-70), so processor P2 looks in node F2 for
the value 65 and finds 65 in F2. P2 then removes 65
from node F2.
Step 2

The value that P2 searched for (65) was found at P2,
therefore Step 2 is not necessary (FIG. 52).
Step 3

The G-node in F is not empty and so Step 3 is not
necessary.

Remove(70) at P3

Root-node G3: [(70)>(40-49)], travel right to node F3;
node F3: [(70)=(60-70)]: 70 falls within the G-node
Range (60-70), so processor P3 looks in node F3 for
the value 70 and finds 70 in F3. P3 then removes 70
from node F3.
Step 2

The value that P3 searched for (70) was found at P3,
therefore Step 2 is not necessary (FIG. 52).
Step 3

The G-node in F is not empty and so Step 3 is not
necessary.

Remove(60) at P2

Root-node G2: [(60)>(40-49)], travel right to node F2;
node F2: [(60)=(60-70)]: 60 falls within the G-node
Range (60-70), so processor P2 looks in node F2 for
the value 60. 60 is not in node F2 at processor P2, so
we must perform Step 2.
Step 2

P2 sends a request to the other processors to look for 60
in their respective nodes F1 and F3. Processor P1 finds 60
in its node F1. P1 removes the value 60 from node F1 and
sends it to P2 (FIG. 53).
Step 3

The G-node in F is empty and so we perform Step 3. The
removal of the G-node is simply a matter of each of the
processors P1, P2 and P3 performing a normal B-tree-node
removal. P1 removes the G-node from F1 in its tree; P2
removes from F2; P3 removes from F3. Each of the
processors re-orders the tree according to the single-processor
B-tree method and removes node E, places its G-node
(71-78) in node D, makes (79-89) the parent value of node
D and performs range adjustment (FIG. 54).
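
The three removal steps used throughout this example can be sketched in Python. This is a sequential illustration only: the loop over the other processors stands in for the parallel request of Step 2, and the function name `remove_value` and the dictionary representation of a G-node are hypothetical.

```python
def remove_value(g_node, requester, value):
    """Remove `value` from a G-node on behalf of processor `requester`.

    g_node maps a processor name to the list of values in its P-node
    (hypothetical representation). Returns (holder, is_empty): the
    processor that held the value, or None if no processor had it,
    and whether the G-node is now empty, in which case the G-node
    itself has to be removed from every tree (Step 3).
    """
    holder = None
    if value in g_node[requester]:            # Step 1: look locally
        holder = requester
    else:                                     # Step 2: ask the others
        for name, p_node in g_node.items():
            if name != requester and value in p_node:
                holder = name
                break
    if holder is not None:
        g_node[holder].remove(value)
    is_empty = all(not p_node for p_node in g_node.values())
    return holder, is_empty


# G-node with Range (60-70) as implied by the walk-through above.
g = {"P1": [60], "P2": [65], "P3": [70]}
print(remove_value(g, "P2", 65))  # found locally at P2
print(remove_value(g, "P3", 70))  # found locally at P3
print(remove_value(g, "P2", 60))  # found at P1; the G-node is now empty
```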

Preferred Embodiment
Rules for Fullness and Ordering Scheme for B-trees Stored
on Disk

The usage of Rules for Fullness and Ordering Schemes is
described in the previous sections. The Rule for Fullness and
Ordering Scheme chosen for those examples assume that the
parallel data-structure is not stored on disk. A different rule
and scheme should be chosen if the processing-elements of
the parallel data-structure are disk-packs rather than actual
CPUs. It should also be noted here that the terms "processor"
and "processing-element" are used to refer to system
components to which work may be distributed in the
maintenance of the parallel data-structure: in this section the
processing-elements are assumed to be disk-packs on a
system with multiple disk drives; the work distributed
amongst the disk-packs is the actual reading and writing of
the blocks that contain the parallel B-tree-nodes.

In this section, another Example of a parallel data-
structure is given. The purpose of the example is to illustrate
the functionality of the Rules for Fullness and Ordering
Scheme chosen for the B-tree stored on disk. The example
describes one possible embodiment of an adapted B-tree.
The manner of describing this example is the same as the
manner used in the previous sections.

The main difference between storing data in memory and
on disk is that disk access is slower. Assuming that the
location of the memory block or disk block is known,
accessing data on disk might take milli-seconds whereas
accessing data in memory would take only micro-seconds.
Therefore, the goal in designing data-structures to be stored
on disk is to minimize the number of disk accesses necessary
to locate the desired data-block. The goal in the design of the
parallel data-structures described in this invention is to allow
the same data-structure to be accessed simultaneously by
multiple processing-elements (or disk-packs in this section)
and thus distribute the work amongst the processing-
elements. Because the goal in designing data-structures for
disk is to minimize accesses, the Rule for Fullness and the
Ordering Scheme of a disk-stored parallel B-tree must be
defined to minimize parallel communication between
processing-elements (disk-packs) and provide the most
efficient access paths possible to desired P-nodes. Steps 2 and
3 described in the Verbal Description require parallel
communication: the parallel communication in Step 2 can be
minimized by choosing an Ordering Scheme that does not
involve all of the disk-packs in locating the proper disk-pack
for placement of a value. The Rule for Fullness can also be
altered so that determining the fullness or emptiness of a
G-node does not involve all of the disks.

The following Rule for Fullness and Ordering Scheme 
will be used in the example for this section: 

1. Rule for Fullness/Emptiness: The fullness of a G-node
in this Example is dependent on the fullness of the
B-tree node that contains the G-node. A B-tree node is
considered full when it contains five values (integers)
and is thereby ready to undergo a B-tree split. A G-node
may be considered full when one-half of the B-tree-
nodes that contain the G-node are ready for a B-tree
split; a G-node may be considered empty when one-half
of the B-tree-nodes that contain it are ready for a
merge or deletion. This information can be stored for
each parallel B-tree-node outside of the parallel data-
structure (possibly in memory). Once one-half of the
parallel B-tree-nodes are ready to split, one of the
G-nodes within the B-tree-nodes is split.
2. Ordering Scheme: This example uses three disk-packs.
Disk 1 will contain the bottom (smallest) one-third of
the range of values in a given G-node; Disk 2 will
contain the middle one-third; Disk 3 will contain the
top (largest) one-third of the Range. (If the G-node
Range were (1-100), then Disk 1 would contain any
value between 1 and 34; Disk 2 would contain any
values between 35 and 67; Disk 3 would contain values
between 68 and 100).
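
One way to compute the disk for a value under this Ordering Scheme is sketched below in Python. The description does not give the formula, so the integer arithmetic here is an assumption, chosen because it reproduces the (1-100) boundaries quoted in rule 2; the name `disk_for_value` is hypothetical.

```python
def disk_for_value(value, low, high, disks=3):
    """Return the 1-based disk holding `value` in G-node Range (low-high).

    Hypothetical helper: the Range is cut into `disks` nearly equal,
    ascending parts, so the answer is computed without reading any disk.
    """
    if not low <= value <= high:
        raise ValueError("value lies outside the G-node Range")
    width = high - low + 1
    return (value - low) * disks // width + 1


# Boundaries for the G-node Range (1-100) quoted in rule 2.
for v in (1, 34, 35, 67, 68, 100):
    print(v, disk_for_value(v, 1, 100))
```

Because only the Range bounds and the value enter the calculation, no other disk needs to be consulted to place a value.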
The Rule for Fullness/Emptiness above minimizes the 
need to access all portions of the B-tree-node in question 
because the information for determining the fullness of the 
parallel B-tree-node is stored external to the tree. The 
Ordering Scheme above minimizes the need to access all 
portions of the B-tree-node in question because the location 
of the proper Disk for a given value in a given Range can be 
calculated mathematically: this allows the direct location 
within memory storage of the exact individual node 
(P-node) contained in a given G-node that could contain a 
given data value within the G-node's G-node Range. This 
Example begins with FIG. 55 showing a parallel B-tree 
ordered according to Rules 1 and 2 above. Note that 
although the same values are stored in the tree in FIG. 55 as 
those stored in FIG. 38, the right-most G-node in the tree is 
ordered differently: this is because of the Ordering Scheme 
rule above. At the beginning of this Example none of the 
B-tree-nodes located in the data-structure are ready to be 
split. 

We now proceed to Insert a number of values into the 
disk-stored B-tree in FIG. 55. 

1.) Insert(60) on Disk 1 and Insert(71) on Disk 2 Simultaneously

Step 1

Root-node A1: [(60)>(20-29)], move right; [(60)>
(40-49)], travel down the right-most link to D1; node
D1: insert 60 into right-most G-node in D1. (Disk 2
follows the same pattern.)
Step 2 

The values are properly ordered within the G-node, so 
Step 2 is unnecessary. 
Step 3 

The G-node is not full (no G-node Split)(FIG. 56) 

2.) Insert(52) at Disk 1, Insert(51) at Disk 2, Insert(59) at
Disk 3

Step 1 

(Step 1 is followed at Disk 2 in order to illustrate the 

functionality of the Ordering Scheme) 
Root-node A2: [(51)>(20-29)], move right; [(51)> 

(40-49)], travel down right-most link to node D2; node 

D2: the value 51 belongs in the G-node with Range 

(50-59). 
Step 2 

Send the value 51 to Disk 1 and place it in the G-node 
with Range (50-59). According to the Ordering Scheme, 51 
must be sent to Disk 1 because it is in the bottom one-third 
of the Range (50-59). Note that Disk 3 is not involved in 
Step 2 because the values contained in node D3 play no part 
in determining the proper Disk for 51: one disk access is 
saved by the Ordering Scheme. 
Step 3 

B-tree-node D3 now contains five (5) values because of
the insertion of the value 59. The addition of the fifth value
and resulting fullness of the B-tree-node D3 is recorded.
Nodes D1 and D2 are still less than full; therefore less than
one-half of the parallel nodes are full, and there is no
Split; Step 3 is unnecessary at this time (FIG. 57).
3.) Insert(53) at Disk 1
Step 1 

The pattern of locating the proper B-tree-node has been
well established at this point (see other examples). The
correct G-node for insertion of 53 is the G-node in B-tree-
node D1 with Range (50-59).
Step 2 

53 falls in the bottom one-third of the Range (50-59); 
therefore Step 2 is unnecessary (FIG. 58). 
Step 3 

The Insertion of 53 into B-tree-node D1 causes D1 to be
full. Node D3 is already full. Therefore, more than one-half
of the parallel nodes are full, and we must perform a B-tree
Split and a G-node Split. This requires accessing the data in
node D on all three disks.
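
The fullness test applied in Step 3 of the last two insertions can be sketched in Python. The sketch assumes the per-node "ready to split" flags are kept outside the tree, as the Rule for Fullness suggests, and it reads "one-half" as "at least one-half"; the name `g_node_is_full` is hypothetical.

```python
def g_node_is_full(node_ready_flags):
    """Apply the disk-oriented Rule for Fullness (hypothetical helper).

    node_ready_flags holds one boolean per parallel B-tree-node (one
    per disk): True when that node has reached five values and is
    ready for a B-tree split. The flags are assumed to be stored
    outside the tree, so no disk is read here.
    """
    if not node_ready_flags:
        raise ValueError("at least one parallel B-tree-node is required")
    ready = sum(1 for flag in node_ready_flags if flag)
    return 2 * ready >= len(node_ready_flags)


print(g_node_is_full([False, False, True]))  # only D3 is full: no split
print(g_node_is_full([True, False, True]))   # D1 and D3 are full: split
```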

Parallel node D contains two G-nodes: one with Range
(50-59), the other with Range (60-78)[Max]. The G-node
with Range (50-59) contains 8 values; the other G-node
contains only 6, so the G-node with (50-59) is chosen for the
Split: the two resulting G-nodes have Ranges (50-54) and
(55-59) (FIG. 59). The resulting B-tree-node configuration
shows that parallel B-tree-node D contains 3 G-nodes and
must be split (B-tree Split). The G-node with range (55-59)
must be re-inserted at the Root-node A. All three Disks
perform this Step in parallel. Re-insertion of the G-node
(55-59) causes the Root-node A to Split (FIGS. 61 and 62).

Preferred Embodiment 
Program Adaptation 

The sequential maintenance program to be adapted can be
made parallel simply by modifying the S-nodes into
P-nodes, grouping the P-nodes into G-nodes (the creation of
a P-node is done along with the creation of the G-node that
contains it), and then adding a few functions in addition to
the original sequential functions. Fullness Rules and
Ordering Schemes may be chosen or defined for efficiency. The
original sequential functions are used to create and maintain
the data-structure configuration: these functions are simply
modified to sort, search and arrange according to the
relationships between G-node Ranges, rather than the
relationships between S-node element values. In the preferred
instance, G-node Range R(X)<R(Y) if all of the elements x_i
in Range R(X) are less than all elements y_i in G-node Range
R(Y): this establishes the relationships between G-nodes in
the adapted data-structures. The method of altering
algorithms is generally to replace comparisons between x and y
in the sequential algorithms with comparisons between R(X)
and R(Y) for the parallelized functions.
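
The replacement of element comparisons by Range comparisons can be illustrated with a short Python sketch. It assumes closed integer Ranges that do not overlap; the class `Range` and the function `find_range` are hypothetical names used only for this illustration, with a binary search standing in for whatever search the sequential data-structure provides.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A closed G-node Range (hypothetical representation)."""
    low: int
    high: int

    def __lt__(self, other):
        # R(X) < R(Y): every element of X is below every element of Y.
        return self.high < other.low

    def contains(self, value):
        return self.low <= value <= self.high


def find_range(ranges, value):
    """Binary-search ascending, non-overlapping Ranges for `value`.

    Returns the index of the Range containing `value`, or None.
    """
    highs = [r.high for r in ranges]
    i = bisect_left(highs, value)  # first Range whose high >= value
    if i < len(ranges) and ranges[i].contains(value):
        return i
    return None


ranges = [Range(20, 29), Range(40, 49), Range(50, 59), Range(60, 70)]
print(Range(40, 49) < Range(50, 59))  # True
print(find_range(ranges, 55))         # 2
print(find_range(ranges, 35))         # None
```

The search touches only Range bounds, never element values, which is the substitution the adaptation method calls for.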
Function List: 

1. Create-G-node(element y)
2. Find-G-node(element y)
3. Search-G-node(G-node v, element y)
4. Add-to-G-node(G-node v, element y)
5. Split-G-node(G-node v)
6. Semi-sort-G-node(G-node v)
7. Adjust-G-node-Ranges(G-node v)
8. Insert-G-node(G-node v)
9. Remove-G-node(G-node v)
10. Resolve-Range-conflict(G-node u, G-node v)
11. Remove-from-G-node(G-node v, element y)

Some of the functions listed above (2, 8 and 9) call the
slightly modified sequential functions for a given data-
structure. "G-node u,v;" and "element y, z, . . ." etc. are
variable declarations or parameters.

Preferred Embodiment Program Adaptation 
Function Explanations: 

1. Create-G-node(element y). This function creates a
G-node by creating one P-node per processor in the
same per-processor location in the data-structure. It
places the element y in the P-node of the processor
chosen to hold y. Any or all processors may place their
own elements y_i in their own P-nodes as well. This
function defines the G-node Range: because the new
G-node will generally be in an undefined state, the
G-node Range may be partially or fully undefined; this
is represented, in most cases, by the use of MAXVAL
and/or MINVAL. If the G-node is the first created in the
structure, its Range will generally be R(X) =
{MINVAL, MAXVAL} for ordinal data. This function
works cooperatively with the other processors. Because
G-nodes are composed of P-nodes, this function is a
parallel node creation function as well as a global node
creation function.

2. Find-G-node(element y). The find global node function
is a searching function that locates a G-node with a
G-node Range into which the element y falls; this
function can provide individual access to each separate
graph or data-structure, locating a G-node Range
without involving the entire global data-structure.
Sequential data-structures that already have Search() functions
need only modify those functions to work with G-node
Ranges as opposed to element values (using the range
function R(G-node)). For sequential data-structures
that normally have no Search() functions, knowledge of
the sequential data-structure must be used to create a
proper Find-G-node() function; in such cases, the
G-node found may be one of many possible G-nodes if
the Ranges overlap. This function returns the G-node
location found. After the G-node location is found and
returned, this function may be combined with the
Search-G-node() function to provide parallel access to
the parallel data-structure.

3. Search-G-node(G-node v, element y). This function 
searches the G-node v cooperatively for the element y 
as a parallel access function. G-node v obviously must 
have a Range capable of holding y. This function may 
be initiated by a given processor i and then have the 
other processors return the results of their search to the 
processor i; thus any one processor may search the 
entire parallel data-structure for an element y by (1) 
locating the proper G-node at will in its own separate 
graph and (2) performing a Search-G-node() in cooperation
if necessary, thus accessing all of the separate
graphs together as a parallel data-structure.

4. Add-to-G-node(G-node v, element y). The add to global 
node function is called after the appropriate G-node for 
element y has been located. This function inserts the 
element y into G-node v. This function may arrange the 
G-node elements in any way desirable for a given 
data-structure according to Rules for Fullness or Order- 
ing Scheme, or this function may simply place element 
y in an empty cell in the P-node of the requesting 
processor that is part of G-node v; if this is not possible, 
then the requesting processor may cooperate with other 
processors to find an empty cell in which to place y in 
G-node v. 

5. Split-G-node(G-node v). {returns new G-node} This
function calls functions 6 and 7. This function is called
when G-node v is full. The first step is to call function
(6) Semi-sort-G-node() which arranges the elements in
G-node v such that they are split into two sets X,Y
(X ∪ Y = W); the resulting sets are partially sorted such
that every element x_i falls into a G-node Range distinct
from the Range containing all elements y_i. Without loss
of generality, we assume unique ordinal elements, an
ordinal relationship of "less-than," and the preferred
method of Range calculation for the data-structure: thus
every element x_i is less than every element y_i; the set
X is contained in cells v_{o1}, the set Y in v_{o2} (i taking
on the values 1≤i≤P). The second step is to create new
P-nodes at all processors and move set X or Y into the
new P-nodes at each processor i. The third step is to call
function (7) Adjust-G-node-Ranges() which resets the
G-node Ranges according to the new distribution of
elements and creates a new Range for the new G-node.
This function (Split-G-node()) may be called on a
defined or undefined G-node; after the function ends
there will be two G-nodes, one of which will usually
remain where it was in the data-structure, the other
must be reinserted by function (8) Insert-G-node() or
placed appropriately. Generally, if the original G-node
v was a defined G-node, then both resulting nodes will
be defined; if not, then at least one of the resulting
G-nodes will be partially defined. The defined G-node
is reinserted (for example see (7) Adjust-G-node-
Ranges()).

6. Semi-sort-G-node(G-node v). As explained above, this
function divides or partially sorts the elements in
G-node v and places the resulting distinct sets into the
proper processors. This function sub-divides and
distributes the portion of data defined by the G-node
Range, in essence creating new ranges. The function
may also send the minimum and maximum values of
the two sets to each processor (or other information for
the calculation of Ranges, Fullness, Ordering, etc.).

7. Adjust-G-node-Ranges(G-node v). The Adjust-G-node-
Ranges() function is key to the adaptation process,
performing range determination to group data into
value ranges; this function is a range addition and
removal function that works in combination with the
insert G-node and remove G-node functions. Like the
Find-G-node() function it depends on the configuration
and rules of the sequential data-structure being adapted.
Examples of Split-G-node() and Adjust-G-node-
Ranges() are given together because they are so closely
related. There are different ways of adjusting Ranges
for different data-structures. Also, the G-node v is not
the only G-node which will have its Range adjusted;
there may be adjustments on any nodes which have
their Ranges wholly or partially dependent on G-node
v. The goal is to maintain the rule which governs the
relationships between nodal values by adjusting the
Ranges to fit the new placement of the elements and/or
G-node(s). The Adjust-G-node-Ranges() function can
operate simultaneously but blindly on all processors.
This function may use the minimum and maximum
values of the elements of the G-nodes in addition to the
values of old Ranges. When adjustments on each
processor are made blindly, they are depended upon to
be identical over all processors because they use the
same values. Example: split and adjustment made in a
parallel ordered list with N G-nodes.
Step 2: Split G-node D creating G-node V
Step 3: Re-insert V (Insert-G-node(V))

In this circumstance (ordered list) the re-insert is
predictable and obvious. Step 4 will adjust all G-node Ranges
at the same time.
Graph for Step 3

(Note that the Range for G-node V is depicted although it
is not calculated until Step 4.)
Step 4: Set new G-node Ranges

Under other circumstances, the new Ranges for D would
be set as well as for V; after which, insertion of V would take
place and a new resetting of Ranges done for the insert; here
all Ranges are set at Step 4 because the placement and
Range-setting are obvious. However, the pattern should be
clear:

Ranges:

R(C): unchanged, {50,74}. R(C)'s second value R(c_{o2})
is still based on R(d_{o1}), which is unchanged.

R(D): R(d_{o2}) is changed: {75,89}, R(d_{o2}) = R(v_{o1}) - 1
= 90 - 1 = 89.

R(V): {90,109}

R(v_{o1}) = minimum value of V = 90

R(v_{o2}) = R(e_{o1}) - 1 = 110 - 1 = 109

R(E): unchanged.

For this parallel ordered list data-structure, the formula for
G-node Ranges R(X) (X taking on values 2<X<(N-1) where
N=number of G-nodes) is

R(X) = { R(x_{o1}) = minimum element of X,
         R(x_{o2}) = R((X+1)_{o1}) - 1 }

The formulas for other data-structures, though more
complex in general, are very similar. At the extreme ends of the
spectrum for the parallel ordered list we have special values:

For N G-nodes we have:

R(a_{o1}) = MINVAL

R(n_{o2}) = MAXVAL

The G-nodes A and N are partially defined G-nodes.

The above example shows G-node addition, adding one
new G-node and one new range to the parallel data-structure.
If the new G-node v were then removed, the process would
simply be reversed, removing G-node v and removing
G-node v's range by merging the range of G-node V into the
range of G-node D.

8. Insert-G-node(G-node v). This function works the same
as the Insert() function for the sequential data-structure
except that it uses the values of G-node Ranges to
arrange and relate the G-nodes rather than element
values to arrange S-nodes (using the range function
R(G-node)).

Adding a new node to a sequential data-structure generally
requires reconfiguring the links to represent changes in
the logical relationships. Each decision (e.g. IF statement) in
the sequential algorithm can be modified to use the range
function R() to produce or adjust the proper relationships
and position the nodes within the order of the data-structure
by range.

All processors may perform this function simultaneously
and blindly; however, in the event that two G-nodes with
overlapping Ranges collide, the Resolve-Range-conflict()
function may be called; Resolve-Range-conflict() is cooperative.
In most respects, this function, Insert-G-node(), is
identical to the sequential Insert().

9. Remove-G-node(G-node v). All of the statements made
about function (8.) Insert-G-node() also apply to this
function with respect to the sequential Remove() function.
In most respects, this function is identical to the
sequential Remove().

10. Resolve-range-conflict(G-node u,v). This function
resolves the problem of overlapping G-node Ranges. A
difficulty presents itself if a data-structure creates overlapping
Ranges because of non-contiguous data placement.
Two G-nodes may try to occupy the same or
adjacent positions in the data-structure. If two such
G-nodes conflict, then the elements in the G-nodes
must be divided between them in such a way that the
new ranges calculated for an element arrangement do
not overlap. This function may determine ranges and
force re-distribution of the element values or it may
semi-sort the elements across the nodes u and v forcing
re-determination of ranges based on the semi-sort.

11. Remove-from-G-node(G-node v, element y). The
remove from global node function is called after the
appropriate G-node for element y has been located.
This function removes the element y from G-node v.
This function may arrange the G-node elements in any
way desirable for a given data-structure according to
Rules for Fullness or Ordering Scheme, or this function
may simply remove element y from the P-node that
contains it.
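
The range formula given earlier for the parallel ordered list (the first value of a Range is the minimum element of its G-node, the second value is one less than the first value of the next G-node, with MINVAL and MAXVAL at the two ends) can be sketched in C as follows. This is only an illustrative sketch under stated assumptions, not the embodiment itself: the type names `Range` and `GNode`, the function name `adjust_ranges`, and the use of `INT_MIN` and `INT_MAX` to stand for MINVAL and MAXVAL are all hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical types: a Range is a {lower, upper} pair of key values. */
typedef struct { int lower; int upper; } Range;

typedef struct {
    int   min_element;  /* smallest element currently stored in the G-node */
    Range range;        /* G-node Range R(X) */
} GNode;

/* Recompute every Range of an ordered list of n G-nodes:
 *   lower(X) = minimum element of X   (MINVAL for the first G-node)
 *   upper(X) = lower(X+1) - 1         (MAXVAL for the last G-node)
 * INT_MIN and INT_MAX stand in for MINVAL and MAXVAL (an assumption). */
void adjust_ranges(GNode *g, size_t n)
{
    assert(g != NULL && n > 0);
    for (size_t i = 0; i < n; i++)
        g[i].range.lower = (i == 0) ? INT_MIN : g[i].min_element;
    for (size_t i = 0; i < n; i++)
        g[i].range.upper = (i + 1 == n) ? INT_MAX : g[i + 1].range.lower - 1;
}
```

With minimum elements 50, 75, 90 and 110 for G-nodes C, D, V and E, this yields the Ranges {50,74}, {75,89} and {90,109} for C, D and V, which agrees with the values set at Step 4 of the ordered list example.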

Preferred Embodiment 
Generalized Parallel Method 

The following is a generalized parallel method which uses
the previously defined functions to create the parallel
data-structures. The configurations of the data-structures in
question are determined by the Insert() and Remove() functions
of the sequential data-structures that have been modified
slightly to use G-node Ranges to establish ordinable
relationships. The slightly modified functions are called from
within functions Insert-G-node() and Remove-G-node().
The indications of steps 1, 2 and 3 to the left of the
pseudo-code are the steps explained in the Verbal Description
section.

Preferred Embodiment 
Function Parallel-Insert (element y):

This function is called by any processor wishing to insert 
the element y into the parallel data-structure. It is assumed 
that the first G-node of the data-structure has already been 
created. 



Parallel-Insert (element y)
    G-node u,v

Step 1 ->   v = Find-G-node(y)
Step 2 ->   if (there is an empty cell in P-node v)
            then
                place y in P-node v
            else
                Add-to-G-node(v,y)
            end if

Step 3 ->   if (G-node v is full)
            then
                u = Split-G-node(v)
                Insert-G-node(u)
                Adjust-G-node-ranges(u)
            end if

END FUNCTION
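
The three steps of Parallel-Insert can be sketched as a single-process C simulation of a parallel ordered list. This is a minimal sketch under several assumptions that do not come from the text above: the constants `P` (P-nodes per G-node), `X` (cells per P-node) and `MAXG`, the names `PList`, `plist_init`, `find_gnode` and `parallel_insert`, the `proc` parameter identifying the calling processor, the split into lower and upper halves, and the round-robin redistribution of elements after a split are all hypothetical choices made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

enum { P = 3, X = 2, CAP = P * X, MAXG = 16 };   /* assumed sizes */

typedef struct { int lower, upper; } Range;
typedef struct {
    int   count[P];     /* number of elements held by each P-node */
    int   cell[P][X];   /* cell[p][i]: i-th element on processor p */
    Range range;        /* G-node Range */
} GNode;
typedef struct { GNode g[MAXG]; int n; } PList;  /* ordered list of G-nodes */

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

static int gnode_size(const GNode *g)
{
    int s = 0;
    for (int p = 0; p < P; p++) s += g->count[p];
    return s;
}

/* The first G-node is assumed to exist already and to cover all values. */
void plist_init(PList *l)
{
    memset(l, 0, sizeof *l);
    l->n = 1;
    l->g[0].range = (Range){ INT_MIN, INT_MAX };
}

/* Step 1: locate the G-node by Range rather than by element value. */
int find_gnode(const PList *l, int y)
{
    for (int i = 0; i < l->n; i++)
        if (y >= l->g[i].range.lower && y <= l->g[i].range.upper) return i;
    return -1;
}

/* Spread k sorted values round-robin over the P-nodes of g. */
static void fill(GNode *g, const int *vals, int k)
{
    memset(g->count, 0, sizeof g->count);
    for (int i = 0; i < k; i++) {
        int p = i % P;
        g->cell[p][g->count[p]++] = vals[i];
    }
}

/* Step 3: split full G-node v; the upper half becomes new G-node v+1. */
static void split_gnode(PList *l, int v)
{
    int vals[CAP], k = 0;
    assert(l->n < MAXG);
    for (int p = 0; p < P; p++)
        for (int i = 0; i < l->g[v].count[p]; i++)
            vals[k++] = l->g[v].cell[p][i];
    qsort(vals, (size_t)k, sizeof vals[0], cmp_int);
    memmove(&l->g[v + 2], &l->g[v + 1],
            (size_t)(l->n - v - 1) * sizeof(GNode));
    l->n++;
    int half = k / 2;
    Range old = l->g[v].range;
    fill(&l->g[v], vals, half);
    fill(&l->g[v + 1], vals + half, k - half);
    /* Range adjustment for the ordered list */
    l->g[v + 1].range = (Range){ vals[half], old.upper };
    l->g[v].range     = (Range){ old.lower, vals[half] - 1 };
}

void parallel_insert(PList *l, int proc, int y)
{
    int v = find_gnode(l, y);                       /* Step 1 */
    assert(v >= 0 && proc >= 0 && proc < P);
    GNode *g = &l->g[v];
    int p = proc;                                   /* Step 2 */
    if (g->count[p] == X)          /* local P-node full: use another one */
        for (p = 0; p < P && g->count[p] == X; p++) ;
    assert(p < P);
    g->cell[p][g->count[p]++] = y;
    if (gnode_size(g) == CAP)                       /* Step 3 */
        split_gnode(l, v);
}
```

In a real distributed setting each P-node would live on its own processor and the split and range adjustment would be carried out cooperatively; the sketch keeps everything in one address space only so that the control flow of the three steps is easy to follow.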



Preferred Embodiment 
Function Parallel-Remove (element y) 

This function finds and removes a specific value y. Some
data-structures remove elements by location (example:
priority-queue); in such cases, the Find-G-node() function
may be adapted to find the proper location, and then the
G-node may be sorted or searched for the appropriate value.

Parallel-Remove (element y)
    G-node v;

Step 1 ->   v = Find-G-node(y)
Step 2 ->   if (G-node v is found)
            then
                Search-G-node(v,y)
                if (y is found in G-node v)
                then
                    Remove y from cell and
                    send to proper processor
                end-if
            end-if

Step 3 ->   if (G-node v is empty)
            then
                Remove-G-node(v)
                Adjust-G-node-Ranges(u)
            end-if

END FUNCTION

It is assumed in the preferred instance that most sequential
functions for the adaptable sequential data-structures can be
encapsulated in the sequential Insert() and Remove()
functions, or that they can be adapted and used in the same
manner as those functions. The Parallel-Insert() and
Parallel-Remove() functions describe the formation and
functioning of the parallel data-structures in their important
aspects.

Preferred Embodiment Program Adaptation

Find Function

This section contains a small section of C-like pseudo-code
intended to illustrate the simplicity of adapting single
processor functions to multiple processor functions. The
code is not intended to be in perfect C syntax, only to give
the basic concepts that could be used for creating the Find
function in parallel form. Nor is the code intended to be the
only possible embodiment of the Find function (Find-G-node).
This example should also help to illustrate the nature
of the "slightly adapted" single processor Insert and Remove
functions mentioned previously because the majority of the
work done for those functions is the location of the proper
position in the data-structure for an element value.

The pseudo-code shown is a Find function for a binary
search tree; the concepts expressed are given to be useable
for other data-structures as well. The pseudo-code shows
that the primary difference between single and multiple
processor functions is the replacing of the comparisons of
element values with comparisons of G-node-Ranges: it is
easiest to illustrate this by taking advantage of operator
overloading with regard to the <, >, and == operators;
these operators are assumed to work equally well on element
values and G-node-Ranges.

Single Processor Definitions and Pseudo-code

    struct node_st{
        key_type key;
        node_st *leftchild;
        node_st *rightchild;
    }

    node_st *Find(node_st *node, key_type element)
    {
        /* comparisons are made on element values */
        if (node->key == element)
            return(node);
        if (node->key < element)
            return(Find(node->rightchild,element));
        else
            return(Find(node->leftchild,element));
    }

Multiple Processor Definitions and Pseudo-code

    X = Maximum number of elements per P-node;

    struct Range{
        key_type lowerbound;
        key_type upperbound;
    }

    struct Pnode_st{
        int node_number;
        key_type key[X];
        Pnode_st *leftchild;
        Pnode_st *rightchild;
        Range Gnode_Range;
    }

    Pnode_st *PFind(Pnode_st *node, key_type element)
    {
        /* ==, < and > are assumed overloaded for G-node Ranges */
        if (node->Gnode_Range == element)
            return(node);
        if (node->Gnode_Range < element)
            return(PFind(node->rightchild,element));
        else
            return(PFind(node->leftchild,element));
    }
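
Because plain C has no operator overloading, a compilable version of PFind needs explicit comparison helpers. The following is one possible sketch, not the only reading of the pseudo-code: it assumes that `==` means "the element falls within the Range" and that `<` means "the whole Range lies below the element". The helper names `range_contains` and `range_below`, the choice of `int` for `key_type`, and the added NULL check are assumptions; the `key[X]` array is omitted because the search uses only the Range.

```c
#include <assert.h>
#include <stddef.h>

typedef int key_type;   /* assumed key type */

typedef struct { key_type lowerbound, upperbound; } Range;

typedef struct Pnode_st {
    int              node_number;
    struct Pnode_st *leftchild;
    struct Pnode_st *rightchild;
    Range            Gnode_Range;
} Pnode_st;

/* Stand-ins for the overloaded == and < operators on G-node Ranges. */
static int range_contains(Range r, key_type e)
{
    return r.lowerbound <= e && e <= r.upperbound;
}

static int range_below(Range r, key_type e)
{
    return r.upperbound < e;
}

/* Returns the P-node whose G-node Range holds element, or NULL. */
Pnode_st *PFind(Pnode_st *node, key_type element)
{
    if (node == NULL)
        return NULL;                     /* no Range holds element */
    if (range_contains(node->Gnode_Range, element))
        return node;
    if (range_below(node->Gnode_Range, element))
        return PFind(node->rightchild, element);
    return PFind(node->leftchild, element);
}
```

The structure of the recursion is unchanged from the single processor Find; only the comparisons differ, which is the point the pseudo-code is making.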



Preferred Embodiment

Data Model

FIG. 63 depicts a possible data model for an embodiment
of the present invention. Each box is a data entity expressing
an aspect of the program design for the embodiment. Each
data entity is a set of rules, data-structure, record, other set
of data and/or associated maintenance functions used to
create the parallel maintenance system. No particular
modeling technique is implied.

The data model shown is only one possible model, given
to express the parallel system components from a data model
perspective. The relationships are described below.

A.1: indicates a set of range adjustment rules may comprise
or relate to multiple sets of range addition rules;

A.2: a set of range adjustment rules may comprise or
relate to multiple sets of range removal rules;

A.3: a set of range adjustment rules may comprise or
relate to multiple sets of range breadth adjustment
rules;

D.1: a set of range determination rules may comprise or
relate to multiple sets of range adjustment rules;

G.1: a set of adjustment need rules applies to many
G-nodes;

G.2: a set of range determination rules applies to many
G-nodes;

G.3: a G-node and G-node Range have a one-to-one
relationship;

G.4: a logical relationship may relate many G-nodes, and
a G-node may have many logical relationships;

G.5: a G-node contains many P-nodes;

G.6: a set of arranging rules applies to many G-nodes;

P.1: a P-node contains many data value storage entities
or elements;

R.1: a set of range relation rules applies to many ranges;

R.2: a set of range relation rules applies to many logical
relationships.

Notes on B-tree Section 

The two rules (Ordering Scheme and Rule for Fullness)
used in this section are not the only possible rules for storing
a parallel B-tree or other data-structure on multi-component
Dynamic Access Storage Devices such as disk drives. Many
other Fullness and Ordering rules may be used (defined), but
the essential pattern of the present method remains the same.
Well known methods exist for storing information on the
locations of data storage blocks. These can be used to store
information on the fullness of the parallel B-tree-nodes. A
bit-map stored in memory would suffice, as would a bit-map
stored in high-speed secondary storage (e.g. a faster, more
expensive disk drive than the others used). Many other
possibilities exist for the storage of the information; the only
requirement is that it allows the determination of the fullness
of B-tree-nodes and G-nodes without accessing every drive.
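
An in-memory bit-map of the kind mentioned above could look like the following C sketch, which keeps one bit per B-tree node (1 meaning full). It is only an illustration: the names `FullnessMap`, `fm_set_full` and `fm_is_full`, the fixed bound `MAX_NODES`, and the idea of indexing the map by a node number are assumptions, not details given in the text.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum { MAX_NODES = 1024 };   /* assumed upper bound on B-tree nodes */

typedef struct {
    /* one bit per node: 1 = node is full, 0 = node has room */
    unsigned char bits[(MAX_NODES + CHAR_BIT - 1) / CHAR_BIT];
} FullnessMap;

/* Record whether the node with the given number is full. */
void fm_set_full(FullnessMap *m, size_t node, int full)
{
    assert(node < MAX_NODES);
    unsigned char mask = (unsigned char)(1u << (node % CHAR_BIT));
    if (full)
        m->bits[node / CHAR_BIT] |= mask;
    else
        m->bits[node / CHAR_BIT] &= (unsigned char)~mask;
}

/* Query fullness without touching any drive. */
int fm_is_full(const FullnessMap *m, size_t node)
{
    assert(node < MAX_NODES);
    return (m->bits[node / CHAR_BIT] >> (node % CHAR_BIT)) & 1;
}
```

A map initialised to all zeros treats every node as not full; the insert and split routines would update it whenever a node fills or is split.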

It should also be noted that the preferred embodiment and
the data-structures and maintenance routines that result from
it function by sending the locations of data-structure nodes
between processing elements. This may require a method of
storing or calculating the location of the memory or disk
space allocated for the different portions of data-structure
storage: one such method would be keeping an index to a
storage space as it is allocated for nodes on each device;
another would be the storage or definition of explicit pointers
in each P-node to the other P-nodes within a given
G-node; for disk drives, many indexing techniques already
exist to perform functions very similar to this. Some such
techniques are described in "Operating System Concepts"
by Silberschatz and Galvin, Addison-Wesley Publishing,
1994 (fourth edition).

The amount of data stored on the B-trees in these
examples is small. It is not reflective of the size of B-trees
on real systems. In addition, the nodes themselves are 
relatively small, and the methods of storing data from 
node-to-node could make use of additional techniques to 
improve efficiency. Commonly known techniques for single- 
processor B-trees include the techniques used for B* trees 
and B+ trees. Also, the use of overflow blocks and tech- 
niques derived from B* and B+ trees could be added to the 
examples given here. 

Other Embodiments 
Definition of G-nodes for the B+ Tree 

A variation on the B-tree is the B+ tree. The following 
describes one embodiment of the parallel B+ tree. The file 
structures book referenced in this application defines a B+ 
tree as a B-tree for which data-file record pointers are stored 
only at the leaves of the tree. This indicates that the 
definition of the B-tree nodes in a B+ tree takes two forms: 
one form for the non-leaf nodes and one form for the leaf 
nodes. The same may be done for the definition of G-nodes 
and their Ranges for parallelized data-structures. I use the 
parallel B+ tree to illustrate this concept. 

Because the elements (tuples) stored in the B+ tree only 
contain data-file record pointers at the leaf-nodes of the tree, 
the G-node Ranges in the non-leaf nodes do not require the 
storage of actual tuples containing record pointers. This 
means that the only useful information in the non-leaf nodes 
is the storage of G-node Ranges: the Ranges are used to 
locate the desired leaf-nodes. B+ tuples are never inserted 
into non-leaf nodes and therefore the parallel Ranges need
not be defined to contain values. Single values may be stored 
in the non-leaf nodes to represent non-leaf Ranges (the 
minimum value of the Range may equal the maximum value 
of the Range). 

The leaf nodes of the B+ tree have G-nodes and G-node 
Ranges defined in the manner described in previous sections.
Insertions of new Ranges into the non-leaf nodes
occur at the time of B-tree node splits. All non-leaf Range 
values are based on the values contained in the leaf-nodes of 
the tree. 

Other Embodiments 

Complex Ranges 

More complex range calculations than those described in 
other sections are possible and justifiable. For example, an 
additional embodiment of an adapted AVL tree may be 
created by the use of a different set of range relation rules or 
range determination rules. The AVL tree previously
described herein used range relation rules defined in a linear
contiguous fashion: $R(A) < R(B)$ if and only if
$R(A)_2 < R(B)_1$ (i.e. Max of A less than Min of B); this produced
a distribution of the total data set such that the possible
storage of a given value x on a processor was only
determinable by locating its G-node Range.

Imagine instead a range function such that the highest
order digit is ignored. Thus, a range (#50-#70) could contain
values 150, 250, 350, 165, 266, 360, 370, etc. In addition imagine
an Ordering Scheme such that processor 1 contains only
values whose first digit is 1, processor 2 only values whose
first digit is 2, etc. This combination of range function R()
and Ordering Scheme creates a parallel structure such that a
given value is known to be stored on a given processor i or
not at all before search begins (e.g. the value 563 will be
found on processor 5 or not at all, the value 828 on processor
8 or not at all, etc.). If leading zeros are assumed, then the
combination also creates a data structure composed of ten
separate structures, each having its own range of possible
values (i.e. (000-099), (100-199), (200-299), etc.).
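
For three-digit keys, the combination described above can be sketched in C as two small functions: one that derives the processor from the high-order digit, and one that tests membership in a range that ignores the high-order digit. The function names `processor_for` and `in_masked_range`, and the restriction to values 0 to 999, are assumptions made only for this illustration.

```c
#include <assert.h>

/* Processor that may hold a three-digit key: its high-order digit
 * (leading zeros assumed, so 42 maps to processor 0). */
int processor_for(int value)
{
    assert(value >= 0 && value <= 999);
    return value / 100;
}

/* Range test that ignores the high-order digit:
 * is value within the range #lo-#hi (e.g. #50-#70)? */
int in_masked_range(int value, int lo, int hi)
{
    assert(value >= 0 && value <= 999);
    int low_digits = value % 100;   /* the two low-order digits */
    return lo <= low_digits && low_digits <= hi;
}
```

Under these assumptions the value 563 maps to processor 5 and 828 to processor 8, and the values 150, 266 and 370 all fall within the range #50-#70, as in the text.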

An advantage gained by the grouping of elements into 
ranges as described above while simultaneously grouping 
the elements by G-node Ranges is that the elements are 
sorted by high order digits and sub-sorted by low order digits 
and simultaneously sorted by low order digits and sub-sorted 
by high order digits. 

Such sorting may even be useful if the P-nodes related by 
high order digits are grouped and contained on a single 
graph, rather than the multiple graphs described in other 
embodiments herein. 

Such complex range calculation as described above shows 
a more advanced grouping of elements by range than other 
embodiments described herein. The elements contained in 
the data structure are organized in two fashions: by high 
order digits and low order digits. This grouping illustrates an 
element's or a P-node's membership in multiple complex 
sets. Another instance (a refinement or improvement of the 
concept of membership in multiple sets) could provide a 
P-node membership in a plurality of sets, each set organized 
for access by different aspects of the data stored (e.g. last, 
first and middle name, etc.). 

FIG. 65 shows nine P-nodes, all related by commonality 
of complex G-node ranges: the nine P-nodes are all part of 
a complex G-node. The ranges may of course be stored 
implicitly and partially calculated by processor number; 
however, on the diagram, they are explicitly listed. Pound 
signs indicate wildcards; numeric entries separated by 
dashes indicate ranges; the first two entries may be com- 
bined to form an ordinal range and then further refined by 
adding the last entry: therefore processor 5, having complex 
G-node Range (#50-#70,2##,##4-##6) may contain num- 
bers between 250 and 270 whose last digit is between 4 and 
6. If the nine processors depicted are on a two dimensional 
mesh of processors, then each linear array may be accessed 
according to the common key attribute being sought by a 
user or system process (e.g. any key being sought between 
100 and 199 will be found on processor 1, 4 or 7). The rules 
for insert (i.e. range relation rules) for the data structure in 
FIG. 65 are assumed to apply to the "#50-#70" portion of 
the complex range: that is, the links are configured by that 
portion of the complex range such that if x>y then
R(#x)>R(#y). FIG. 65 represents a parallel data-structure
considered to have two dimensions at the processor or storage
level; the possibility of more dimensions is implied.
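
A membership test for a complex G-node Range such as (#50-#70,2##,##4-##6) can be sketched in C as below. The struct `ComplexRange`, its field names, and the function `complex_range_contains` are hypothetical; the sketch assumes three-digit keys and treats the three entries as bounds on the two low-order digits, a required high-order digit, and bounds on the last digit.

```c
#include <assert.h>

typedef struct {
    int low2_min, low2_max;   /* "#50-#70": bounds on the two low-order digits */
    int high_digit;           /* "2##": required high-order digit */
    int last_min, last_max;   /* "##4-##6": bounds on the last digit */
} ComplexRange;

/* Returns 1 if value satisfies all three parts of the complex Range. */
int complex_range_contains(ComplexRange r, int value)
{
    assert(value >= 0 && value <= 999);
    int low2 = value % 100;
    int last = value % 10;
    return value / 100 == r.high_digit
        && r.low2_min <= low2 && low2 <= r.low2_max
        && r.last_min <= last && last <= r.last_max;
}
```

For the Range of processor 5 in FIG. 65, the values 254 and 265 are accepted, while 270 (last digit 0), 354 (high-order digit 3) and 244 (outside #50-#70) are rejected.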

The great variety of combinations offers a wide range
of possible uses according to the needs of a given system or
data-structure.

Other Embodiments 

Dependent Ranges

Imagine a decision-tree used, for instance, to play chess.
A given function can identify when a piece on the board is
threatened by an opposing piece. This increases the priority
of moving that piece. A given node in the tree representing
this situation on the board will have a wider range of
possible moves than nodes dependent on the given node.
Each possible movement of the threatened piece is within
the range defined by its identity as a move of the piece and
its dependency on other nodes. This range may be distributed
to multiple processing elements according to range
determination, Rules for Fullness and Ordering Schemes
defined for the decision tree algorithm's use in a parallel
environment. In this instance, the values and ranges for the
nodes are created together, rather than input and inserted or
removed.


Addendum 

G-node Ranges 

The calculation of G-node-Ranges is key to the entire
process. Generally, it is simply a matter of determining
which nodes on a data-structure most closely determine the
positions of the other nodes; that is to say which nodes
contain values that determine the values that may be contained
in a given node. Partially defined G-node Ranges may
frequently be found at the extreme points of the
data-structure; for instance, the root and leaves of a heap, or the
right most and left most node of a binary search tree, 2-3
tree, or B-tree.

Addendum 

Ordinal vs. Ordinable Data

Most of the Examples for this application are given for
ordinal data types. However, any data-structure having the
capacity of the data values to be grouped into suitable
G-node Ranges will be adaptable. If the G-node Ranges can
be constructed such that the nodes which branch off from the
members of the Range can be said to relate to all of those
members in a similar fashion, then the parallel or global
links between nodes are justified and will be consistent with
the data-structure and/or method rules. Such data-structures
and/or methods are adaptable by this process.

Addendum 
Use of Space/Merging G-nodes 

Because the G-node-Remove() function is only used on
sufficiently empty G-nodes, it is possible to have large
data-structures with large numbers of partially empty
P-nodes; however, the present method is capable of adjustment
to make efficient use of space. A G-node-Merge()
function to merge two sparsely populated G-nodes into one
would be one way to resolve this problem; another would be
to alter the Rule for Fullness, changing the lower limit on the
number of elements in a G-node to half P and remove
P-nodes that break the rule, reinserting their elements.

Addendum 

Contiguity of Data Distribution

Non-contiguous data distribution like that of a heap
makes difficult the efficient search, and therefore efficient
placement, of elements into unique G-node Ranges. One
solution to this is the Resolve-Range-Conflict() function;
however, for those data-structures that can be forced into a
contiguous configuration and thereby make efficient searching
possible, this function may not be necessary.
Non-contiguous methods of defining ranges or distributing values
may also be defined and used for the present invention.


Addendum 
Data-Structures/Methods 

The data-structures and methods listed in this application
are only examples of adaptations from sequential to parallel.
Many other data-structures and methods not listed can be
adapted through this process. No restriction on types of data
stored or manner of storage is implied. Many distributed or
parallel data-structures may be created in accordance with
the principles of the present invention. The present invention
may also be used to create new data-structures without serial
counter-parts.

Examples of Application of Preferred Embodiment

The following two examples illustrate the functioning of
two adapted parallel data-structures on a working system.
Two examples are given to show that the parallel routines
and data-structures may function on a variety of different
systems. Specifically, one Example is stored in memory on
a parallel-processing hypercube network, and the other is
stored on disk. Although the data-structures can be used by
any program, whether batch or on-line, the examples illustrate
the functioning of the data-structures by assuming
multiple users accessing the same system simultaneously.

Example of Application 1

FIG. 66 shows a parallel machine with 128 processors
connected by a hypercube network. A powerful machine
such as this could serve a great number of users
simultaneously, but only three are depicted. The
terminals (numbered 1, 14, and 127) have allocated their
respective processors 1, 14, and 127, and are conducting
on-line accesses to a file located on disk in a hashed file with
secondary keys stored in an Adapted parallel data-structure:
the keys are stored in the memories of the various processors
distributed throughout the hypercube on a parallel m-way
search tree. Each processor has 16 Mega-bytes of memory.
Each processor stores approximately 1/128th of the file's keys
in its local memory. If we assume that the search-tree can
store each key using 20 bytes of memory (including
pointers, indexes, etc.), and we also assume that each
processor uses a maximum of approximately 1 Mega-byte of
RAM, then approximately 50,000 keys may be stored on
each processor: 50,000 x 128 = 6,400,000 keys may be stored
in parallel memory. The same tree stored in a single
processor's memory would require 128 Mega-bytes of memory
and probably force the storage of the tree onto disk. In
addition, each user on the system may search the tree
simultaneously: little or no queuing will result from
simultaneous accesses to the tree, unless more than 128 users are
logged on.

(Note that FIG. 38 used for this example was designed for
3 processors in the General Example: the key ranges and size
of the tree are accordingly small, and the processor numbers
illustrated are different.) If we assume that the processors 1,
14, and 127 contain the search-tree nodes depicted in FIG.
38, then User 1 could request key 56, User 14 could request
key 15, and User 127 could request key 10 simultaneously.
Each processor would then access two nodes of its own
locally stored tree to reach the bottom level, send requests
for keys to other processors as necessary, and receive replies.
The same values stored on a single-processor tree would
require more accesses (a taller tree) and queuing. In either
case the disk could be accessed after retrieval of the keys
from the search tree, and the users would receive the
appropriate records from disk.

Example of Application 2

FIG. 67 shows three terminals connected to a server, the
server being connected to three disk-packs. A parallel B-tree
distributed amongst the three disk-packs can be accessed
simultaneously by each user. If Users 1, 2 and 3 all make
requests to access the B-tree index at the same time, then the
server would have to queue these requests and distribute
them among the disk-packs one-at-a-time. However, disk
access is much slower than memory access, so the queuing
and distribution of the requests in the server's memory
might take 10 microseconds per request. Therefore, the last
request to disk would be made 30 microseconds after the
user made the request. If we assume that each key is stored
on the fourth level down in the B-tree, and we assume that
each disk access takes 10 milliseconds, then the last
request for a key is fulfilled 40 milliseconds + 30 microseconds
after the request. If the B-tree were stored on a single
disk, then in the worst case, each key request would take 40
milliseconds, and the last of the three requests would be
fulfilled after 120 milliseconds + queuing time.

The precise make-up of the two systems described above
differs, but in each case the work is successfully distributed
to the processing elements, giving better response time.

The precise make-up of the systems
depicted is given only for the purpose of example. The
times given are estimates and the calculations are simple
illustrations of the functioning of the types of systems that
could make use of the Adapted data-structures.

Conclusion, Ramifications and Scope

Thus the reader can see the results of combining the
various aspects of this method of creating and using parallel
data-structures. The present invention provides a great variety
of possible combinations of rules for fullness of nodes,
range determination, parallel and global node definition, and
data distribution such that each aspect of the invention, in
addition to others not listed, may be used in combination
with one or more of the others, or alone, to enhance
performance of the parallel data structures and define new
data-structures, including parallel forms of serial
data-structures and many others.

The combinations of components in the embodiments
herein are not the only combinations possible. Not only are
different combinations possible, but different instances of
the components themselves, such differences exemplified by
the various rules for fullness, ordering schemes, and range
calculations described and contrasted in this application,
though not limited to those descriptions or those components.

While my description above contains many specifics,
these should not be construed as limitations on the invention,
but rather as an exemplification of preferred embodiments
thereof. Many other variations are possible. Accordingly, the
scope of the invention should not be limited to the embodiments
illustrated; the scope of the invention should be
determined by the appended claims and their legal equivalents.

I claim:

1. A method of maintaining order for data on a computer
system by creating a parallel data-structure, said data stored
on one or more memory storage means, accessed by one or
more processing elements, said order represented either
explicitly or implicitly as a graph or graphs containing nodes
that represent sets of data values grouped into ranges and
incident links that represent logical relationships between
said sets of data values, the nodes and links either explicitly
or implicitly stored on said memory storage means, said
memory storage means and said order maintained by said
processing elements,

i. wherein said memory storage means is divided into
logically corresponding storage units or partitions, said
partitions defined by a parallel storage location of said
nodes on said memory storage means, and

ii. wherein one form of said logical relationship is a serial
or local relationship relating two or more differing said
ranges to each other according to their differences, said
local relationships relating said ranges within said
partition of said memory storage means, and

iii. wherein a second form of said logical relationship is 
a global relationship relating two or more similar said 
ranges to each other according to their similarity or 
commonality, said global relationships relating said 
ranges between multiple said partitions, and 

iv. wherein said nodes form individual nodes having said 
local relationships with other said individual nodes 
with different said ranges, said individual nodes also 
referred to as parallel nodes, and 

v. wherein said nodes form global nodes comprising 
multiple said individual nodes having said global rela- 
tionships with other said individual nodes, said global 
nodes thereby comprising multiple said individual 
nodes with the similar or common said ranges, each 
said individual node within a given said global node 
having the common range, such that said global node is 
a composite of said individual nodes, 

the method comprising the steps of: 

a. determining said ranges for said sets of data values, 

b. assigning said ranges to said nodes and assigning 
said sets of data values to said nodes by determining 
the ranges into which they fall, 

c. positioning said individual nodes within said order 
using said links by determining said local relation- 
ships and said global relationships between said 
ranges, 

d. storing said nodes in different portions of said 
memory storage means such that said ranges with 
said commonality are stored on multiple said indi- 
vidual nodes, each said individual node with the 
common said range stored in a different said portion 
thereby defining said partitions and said global nodes 
comprising a plurality of said individual nodes, and 
thereby storing said local relationships as explicit or 
implicit said links within said partition and said 
global relationships across multiple said partitions, 

whereby a combination of the local and global relationships 
creates a composite global data-structure comprising mul- 
tiple serial or local data-structures, and whereby said data is 
maintained in said order on each of said partitions in a 
uniform manner and on all of said memory storage means 
combined, and a plurality of system processes are enabled to 
access the data values simultaneously by accessing said 
individual nodes in a given said partition of choice, thus 
gaining access to said global node having desired said range 
and to the global data-structure as a whole. 

2. A method as recited in claim 1 wherein said order is 
expressed as a plurality of separate said graphs, each said 
graph stored separately within said memory storage means, 
each said graph arranged by arranging rules of an adapted 
sequential data-structure, thus creating said parallel data- 
structure capable of the same functions as said sequential 
data-structure in a distributed environment. 

3. A method as recited in claim 2 wherein each said 
individual node contained in a given said global node has an 
identical said range to all other said individual nodes in the 
given global node and wherein all the logical relationships 
between all said individual nodes belonging to the given 
global node and all said individual nodes belonging to 
another said global node are identical. 

4. A method as recited in claim 2 wherein said memory 
storage means is composed of a plurality of disks, and said 
order is defined by a set of rules for maintaining a serial 
b-tree as adapted to function using said ranges in a parallel 
environment, thus creating a plurality of b-trees located on
said disks, each said b-tree represented as a separate said
graph composed of said individual nodes, every said
individual node contained on a given said portion of said disks
belonging to a different said global node from all other said
individual nodes contained on the given portion of said 
disks. 

5. A method as recited in claim 2 wherein the plurality of 
separate said graphs is created and maintained by the present
method and each of said separate graphs has an identical
structure as every other said separate graph, said structure 
defined by an identical positioning of each said individual 
node contained in a given said global node to each other said 
individual node contained in each adjacent said global node,
that is, each said individual node belonging to the given said
global node holds the same position within each said sepa- 
rate graph as every other said individual node belonging to 
the given said global node holds in its said separate graph, 
whereby each said separate graph is identical in form and
function to each other said separate graph and is thus able to
function as a separate data-structure on a single said pro- 
cessing element and is also able to be combined with the 
other said separate graphs and function as said parallel 
data-structure on multiple said processing elements. 

6. A method as recited in claim 1 further employing the
steps of: 

a. identifying where said range is too broad for a given 
said global node thereby indicating a need to split said 
range, 

b. upon the indication of need to split said range, splitting
said range by performing the range determination on 
the range being split and adjusting adjacent said ranges 
as necessary thereby creating at least one new said 
range, assigning the new range or ranges to a new said
global node or nodes and performing the positioning to
position the new nodes and existing nodes as necessary 
using said links thereby adding the new nodes to said 
order and maintaining said order, 

c. identifying where said range is too narrow for a given
said global node thereby indicating a need to broaden 
said range, 

d. upon the indication of need to broaden said range, 
performing the range determination to adjust said
ranges for adjacent said ranges as necessary, removing
the global node containing the too narrow range if 
necessary, and performing the positioning as necessary 
to reconfigure said links for remaining said nodes 
thereby removing appropriate said nodes from said
order and maintaining said order,

e. upon the indication that said range or ranges are too 
broad or too narrow, adjusting said range or ranges by 
performing the range determination thereby adjusting 
said range or ranges to proper breadth, 

whereby said order is manipulated using said ranges, and
said logical relationships are manipulated as necessary to 
change data storage patterns while maintaining said order of 
said data. 
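
As a non-limiting sketch of steps (b) and (d) above, the ordered set of ranges can be pictured as a sorted list of half-open `(low, high)` pairs. The function names `split_range` and `merge_range`, the midpoint split, and the choice of which neighbour absorbs a removed range are all assumptions of this sketch; the claim itself leaves the range determination open.

```python
def split_range(ranges, i):
    """Split ranges[i] = (low, high) at its midpoint, keeping the order."""
    low, high = ranges[i]
    mid = (low + high) // 2
    if mid == low:
        raise ValueError("range too narrow to split")
    # The two new ranges together cover exactly the range being split.
    return ranges[:i] + [(low, mid), (mid, high)] + ranges[i + 1:]


def merge_range(ranges, i):
    """Remove ranges[i] by broadening an adjacent range to cover it."""
    if len(ranges) < 2:
        raise ValueError("no adjacent range to broaden")
    low, high = ranges[i]
    if i + 1 < len(ranges):
        # Broaden the right-hand neighbour downward.
        _, next_high = ranges[i + 1]
        return ranges[:i] + [(low, next_high)] + ranges[i + 2:]
    # Last range: broaden the left-hand neighbour upward.
    prev_low, _ = ranges[i - 1]
    return ranges[:i - 1] + [(prev_low, high)]
```

In both operations the list remains sorted and gap-free, so the order of the data is maintained while the storage pattern changes.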

7. A method as recited in claim 6 wherein the range split 
is performed by creating a new dependent range, said new
dependent range based on the range being split, at least a 
portion of said new dependent range being beyond the range 
being split thus representing an extension of the range being 
split and narrowing the range being split by combining the 
ranges,

whereby the range being split is narrowed by combining the
range being split with said new dependent range, the
combination of two ranges representing the combination of two
restrictions and therefore becoming more restrictive or
narrower.

8. A method as recited in claim 6 wherein the identifica- 
tion that the range is too broad is achieved by a rule for 
fullness of said global nodes that employs a measurement of 
the number and positions of the data values within said 
global node, the identification of the too broad range defined 
by an excess of the data values, said excess indicating that 
said global node is sufficiently full for the range split and 
thus for the addition of the new node, and wherein the 
identification that the range is too narrow is performed by 
the measurement of the number and positions of the data 
values within said global node, the identification of the too 
narrow range defined by an insufficiency of the data values, 
said insufficiency indicating that said global node is suffi- 
ciently empty for the removal of the node, and further 
employing the steps of: 

a. locating a proper said global node with a proper said 
range to contain desired said data values by traveling
along said links and choosing a path through said links, 
said path determined by using said ranges assigned to 
said global nodes, 

b. upon determination of the proper global node, deter- 
mining proper said individual node within the proper 
global node to contain a given said data value, 

c. upon determination of the proper individual node, 
adding or removing said given data value to or from the 
proper individual node thereby adding or removing the 
given data value to or from the proper global node, 

d. upon the addition or removal of the given data value, 
determining if the proper global node is sufficiently full 
or sufficiently empty, 

e. upon determination that the proper global node is
sufficiently empty for the global node removal, per- 
forming the global node removal wherein the global 
node removal redistributes the data values as necessary 
within their respective said ranges, 

f. upon determination that said global node is sufficiently
full for the global node addition, performing the global 
node addition wherein the global node addition splits 
the range of the sufficiently full global node and redis- 
tributes the data values as necessary within their 
respective said ranges thereby splitting the sufficiently
full global node and adding said new global node or 
nodes to said order. 
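
Steps (a), (b), (c), (d) and (f) above can be pictured with the following simplified sketch, which replaces the linked graph with a flat sorted list of global nodes. The class name `RangeIndex`, the `capacity` fullness rule, the least-loaded placement of values, and the median split point are assumptions of the sketch, and value removal with node removal (step e) is omitted for brevity.

```python
import bisect


class RangeIndex:
    """Flat list of global nodes, each spread over n_partitions parts."""

    def __init__(self, low, high, n_partitions, capacity):
        self.capacity = capacity
        self.lows = [low]    # sorted lower bounds, one per global node
        self.highs = [high]  # matching exclusive upper bounds
        # nodes[g][p] holds the values of global node g on partition p
        self.nodes = [[[] for _ in range(n_partitions)]]

    def locate(self, value):
        # Step a: choose the global node whose range contains the value.
        g = bisect.bisect_right(self.lows, value) - 1
        if g < 0 or value >= self.highs[g]:
            raise KeyError(value)
        return g

    def insert(self, value):
        g = self.locate(value)
        # Step b: pick an individual node (here, the least loaded one).
        part = min(self.nodes[g], key=len)
        part.append(value)  # step c
        # Step d: fullness rule based on the number of stored values.
        if sum(len(p) for p in self.nodes[g]) > self.capacity:
            self._split(g)  # step f

    def _split(self, g):
        values = sorted(v for p in self.nodes[g] for v in p)
        mid = values[len(values) // 2]
        if mid == self.lows[g]:
            return  # a split here would create an empty range
        # Redistribute each partition's values within the two new ranges.
        left = [[v for v in p if v < mid] for p in self.nodes[g]]
        right = [[v for v in p if v >= mid] for p in self.nodes[g]]
        self.nodes[g:g + 1] = [left, right]
        self.lows.insert(g + 1, mid)
        self.highs.insert(g, mid)
```

Note that the split divides the range, not a single node: every individual node of the full global node is divided at the same point, so each partition keeps a share of both resulting global nodes.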

9. A method as recited in claim 8 wherein said parallel 
data-structure is adapted from a set of ordering rules of a 
sequential data-structure and wherein said parallel
data-structure is maintained by said processing elements as
controlled by a parallel maintenance process adapted from a 
sequential algorithm for maintaining said sequential data- 
structure by utilizing the same said ordering rules applied to 
said ranges rather than applied to individual said data values. 

10. A method as recited in claim 9 wherein the data
maintained are ordinal and wherein said ranges defined for
said global nodes are unique, non-overlapping said ranges
covering the expanse of said data.

11. A method as recited in claim 10 wherein the method
is used to maintain key values on a distributed database that
are accessed by said plurality of system processes or a
plurality of users.

12. A method as recited in claim 1 wherein said global
relationships are expressed by a specific said parallel storage
location of said nodes within said memory storage means
such that a first memory address allocated for a first said
individual node is used to derive a second memory address
allocated for a second said individual node within the same
said global node,

whereby locating one said individual node within a given
said global node enables said processing elements to easily
derive the locations of other said individual nodes within
said global node.

13. A method as recited in claim 1 further employing a set
of rules for arranging the data values within said global
nodes to provide efficient locating means to locate within
said memory storage means an exact said individual node
contained in said global node that could contain a given said
data value within said range.

14. A machine to maintain an order for data on a computer
system containing one or more processing means, one or
more memory storage means, and communication means
linking said processing means and said memory storage
means to form said computer system comprising:

a. range determination means to group said data into
ranges, each said range capable of being arranged in a
sequence or sequences with other said ranges such that
said range determination means groups said data into
multiple said ranges and such that said sequences
between said ranges thereby arrange the data said
ranges contain,

b. distribution means to subdivide and distribute each said
range to subsets that define subdivisions stored on
multiple parallel or individual nodes on said memory
storage means,

c. composite global nodes containing the distribution of a
given said range, said global node comprising multiple
said individual nodes, each said individual node storing
a portion of said range defined by said subdivision,

d. relation means to define logical relationships between
said global nodes and logical relationships within said
global nodes by said ranges, wherein said logical
relationship between said global nodes is defined by a
difference between said ranges, such that if said relation
means compares a given said global node with another
said global node or their component said individual
nodes, then said relation means achieves equivalent
comparison results indicating said difference between
said ranges, and wherein said logical relationship
within said global node is defined by a commonality,
such that component said individual nodes contained
within said global node all have the same said logical
relationship indicating said commonality in said range,
such that said processing means are enabled to arrange
said individual nodes with each other using said
differences and enabled to arrange said individual nodes
within said global nodes using said commonality,

whereby said order is expressed by grouping said data into
said ranges and defining said logical relationships between
said ranges, and whereby said ranges are able to be
distributed within said computer system creating an arrangement
of said data providing the order for said data such that it is
easily accessed and maintained by multiple system
processes.

15. A machine as recited in claim 14 wherein said
subdivisions are grouped into separate sets, each said set
having its own valid said logical relationships between said
subdivisions and therefore between said ranges, said
individual nodes, and said global nodes, each said separate set
defining a separate graph stored on a division of said
memory storage means, thereby creating a plurality of said
separate graphs, each said separate graph expressing said
order, and all of said separate graphs together expressing
said order thereby creating a parallel data-structure, and
further including: 

a. individual access means providing an access to an 
individual said separate graph as an individual expression
of said order, said separate graph accessed as a
valid separate data-structure, and

b. parallel access means to access multiple said separate 
graphs together as parallel expressions of said order 
wherein the access to the individual said separate graph 
enables access to other said separate graphs, thereby
accessing multiple said separate graphs together as said
parallel data-structure,

whereby a plurality of said processing means, system pro- 
cesses or users are enabled to efficiently access said data 
through said separate graphs using separate access paths and
accessing separate parts of said memory storage means to 
achieve consistent results. 

16. A machine as recited in claim 15 further comprising: 

a. range measurement means to determine if said ranges 
need adjustment providing said range determination
means with cause to regroup said data, 

b. range addition and removal means to add new said 
ranges and remove old said ranges to and from said 
graphs wherein said old ranges are deleted or merged 
with other said ranges, and said new ranges are derived
or split from said old ranges and added in addition to 
said old ranges, and 

wherein said range measurement means determine if the 
ranges must be added or removed, said range addition and 
removal means add or remove said ranges, said distribution
means redistribute the data defined by said ranges as 
necessary, and said relation means reconfigure said separate 
graphs by adjusting the logical relationships between said 
ranges, 

whereby said processing means are enabled to alter a
configuration of said parallel data-structure while maintain- 
ing said order. 

17. A machine as recited in claim 15 wherein said 
distribution means distributes the data to provide said
processing means a plurality of dynamically chosen access
paths to a given said subdivision or distributed part of range 
for use by the parallel and individual access means, 
whereby said processing means are enabled to choose freely 
which said separate graph to use for access, accessing a 
chosen said separate graph until a given said distributed part
of range is required, and 

whereby said processing means are enabled to efficiently 
distribute work through the free choice of which said 
separate graph to use for access to said data. 
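
The free choice of access path recited above might, purely as an illustration, look like the following sketch, in which every separate graph is reduced to its own copy of the sorted range lower bounds. The class name `ParallelIndex`, the `hits` counter, and the round-robin selection policy are assumptions of the sketch; the claim leaves the choice of graph to the processing means.

```python
import bisect
import itertools


class ParallelIndex:
    """Several identical graphs, any of which can answer a range lookup."""

    def __init__(self, bounds, n_graphs):
        # Each graph holds its own copy of the sorted range lower bounds.
        self.graphs = [list(bounds) for _ in range(n_graphs)]
        self.hits = [0] * n_graphs  # lookups served by each graph
        self._next = itertools.cycle(range(n_graphs))

    def find(self, value, graph=None):
        """Return the position of the range containing value."""
        # The caller may name a graph; otherwise one is chosen in turn.
        g = next(self._next) if graph is None else graph
        self.hits[g] += 1
        return bisect.bisect_right(self.graphs[g], value) - 1
```

Because every graph gives the same answer, lookups can be spread across graphs (and therefore across storage units) without affecting the result, which is one way the work could be distributed.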

18. A machine as recited in claim 16 wherein said
individual access means searches a given said separate graph 
for a desired said range thereby identifying a proper said 
global node to contain the desired range whereupon said 
parallel access means locates a proper said subdivision or 
subdivisions within said proper global node, said proper
subdivisions partially or completely containing the desired
range, the identification of the desired range allowing access
to desired said data whereupon said processing means uses
said data and may therefore have need to alter said order, the
alteration of said order is accomplished by using said range
measurement means and said range addition and removal 
means thereby creating a parallel maintenance program 
executed by said processing means for maintaining said 
parallel data-structure. 

19. A machine as recited in claim 16 wherein the range
addition means divides an existing said range into
sub-ranges thereby creating said new ranges, and said
distribution means includes an efficient ordering scheme to
redistribute said data contained in the existing range to the new
range or ranges, thereby creating one or more new said 
global nodes. 

20. A machine as recited in claim 18 wherein said parallel 
maintenance program is adapted from a serial maintenance 
program for maintaining a serial data-structure, said parallel 
maintenance program creating said parallel data-structure 
and utilizing said ranges such that it functions as the adapted 
serial data-structure in a parallel environment. 

21. A machine as recited in claim 14 wherein said memory 
storage means comprises a plurality of memory storage units 
and wherein said individual nodes are distributed among 
said plurality of memory storage units and linked by said 
relation means to form a parallel data-structure.

22. A machine as recited in claim 21 wherein said 
processing means comprises a plurality of processing 
elements, each said processing element containing a main- 
tenance program for controlling said memory storage 
means, each said processing element able to control one said 
memory storage unit at a time and able to cooperate with 
other said processing elements to control multiple said 
memory storage units using said maintenance program, the 
plurality of maintenance programs thereby combining to 
form a parallel maintenance program, 

whereby said parallel maintenance program controls and 
orders said data through control of said parallel data- 
structure. 

23. A machine as recited in claim 22 wherein said parallel 
maintenance program is adapted from a serial maintenance 
algorithm, said parallel maintenance program functioning 
through the use of said ranges, said ranges used as parallel 
embodiments of the data used in said serial maintenance 
algorithm. 

24. An article of manufacture for a computer system, said 
computer system comprising a memory means and process- 
ing means, said processing means comprising one or more 
processing elements, said processing elements able to access 
said memory means as one or more logically corresponding 
storage locations or memory units, said article controlling an 
ordering of data on said computer system through a parallel 
storage of said data defining a parallel data structure, said 
article comprising: 

a. range determination rules that enable said computer 
system to group said data into sets according to ranges 
of said data, said range determination rules able to 
define multiple said sets with equivalent said ranges, 

b. data storage entity definition rules that enable said 
computer system to define data storage entities that 
contain part of said data as defined by said range, 

c. parallel node definition rules that enable said computer 
system to define parallel nodes, said parallel nodes 
containing one or more said data storage entities, said 
parallel nodes defined by said ranges indicating the data 
values that said parallel node is able to contain, 

d. composite global node definition rules that define 
global nodes as composites of said parallel nodes, said 
global nodes comprising multiple said parallel nodes 
with a sufficient commonality in said ranges, said 
parallel nodes having said commonality in said ranges 
being therefore within the same said global node, said 
parallel nodes having a difference in said ranges 
between said parallel nodes being therefore within 
separate said global nodes where said difference pro- 
duces sufficient distinction between said sets, said 
parallel nodes within the same said global node stored 
on logically corresponding said memory units, 
e. range relation rules that enable said computer system to
logically relate said ranges and thereby relate said sets,
said data and said parallel nodes, said range relation
rules determining said commonality and said
difference,

whereby said ranges are logically related to each other
thereby relating said sets, said parallel nodes, and the data
values, and

whereby the relations between said parallel nodes create a
plurality of serial data structures linked by the commonality
of ranges that defines said global nodes, thus expressing said
ordering of data by creating the parallel or global data
structure as a composite of said serial data structures, and
thus providing parallel and global means to control the data
structures.

25. An article as recited in claim 24 further including:

a. global node creation means that creates and defines said
global node by grouping together said parallel nodes
that are related by said commonality in ranges,

b. global node relation rules that utilize said range relation
rules to logically relate said global nodes to each other,

whereby said computer system is enabled to globally
manipulate said global nodes on said memory means.

26. An article as recited in claim 25 further including:

a. adjustment need rules that determine a need for
adjustment to said ranges to maintain said order for said data,

b. range adjustment rules that enable said computer
system to adjust said ranges, changing the breadth of said
ranges,

wherein said range relation rules are used to adjust the
logical relationships to appropriately relate the adjusted
ranges,

whereby said processing means are enabled to alter a first
expression of said order to produce a second expression of
said order while maintaining the rules that define said order
for both of the expressions, and

whereby changing the data organized in said order results in
a change in a given expression of said order while
maintaining the rules that define said order.

27. An article as recited in claim 26 wherein said range
relation rules further define said commonality to produce
equivalent comparison results indicating said commonality
when comparing one said parallel node within a given said
global node to any of said parallel nodes within the same
said global node, and further define said differences to
produce equivalent comparison results indicating said
differences when comparing one said parallel node within a
given said global node to any of said parallel nodes within
a separate said global node, and wherein said range
adjustment rules contain range addition and removal rules to add
new said ranges to said order and remove old said ranges
from said order, adding new said parallel nodes and
removing old said parallel nodes as necessary, and adjusting the
logical relationships as necessary.

28. An article as recited in claim 26 wherein said
computer system defines said order by using said commonality
and said difference to create a parallel expression of a serial
data structure with its own rules of ordering, thus defining
said parallel data structure, said parallel data structure
comprising a plurality of separate said serial data structures
related to each other by said commonality in ranges and
configured by the rules of ordering said serial data
structures.

29. An article as recited in claim 26 wherein said
computer system defines said order by using said commonality
and said difference to create a plurality of separate data
structures stored separately on said memory units, each said
separate data structure identical in configuration to each
other said separate data structure.

30. An article as recited in claim 27 wherein the range
addition is accomplished by splitting said old range into two
or more said new ranges, said new ranges being equal to said
old range when combined, said new ranges defined such that
each has an ordinal range relationship to each other, and
wherein said range relation rules relate the ranges in said
order to each other by said ordinal range relationship.

31. An article as recited in claim 30 further including:

a. find global node means by which said order is searched
for a desired said range by using said range relation
rules,

b. add to global node means that adds the data values to
said global nodes,

c. remove from global node means that removes the data
values from said global nodes,

wherein locating the desired range allows access to a proper
said global node to contain a given value of said data, and
upon the locating, the data values are added to or removed
from the proper global node altering the global node
contents as necessary, and said adjustment need rules determine
if the alteration of the global node contents results in said
need for adjustment, whereupon said ranges are adjusted and
the relationships are altered as necessary.

32. An article as recited in claim 31 wherein the logical
relationships, said find global node means, the addition of
new parallel nodes and the removal of old parallel nodes are
adapted parallel versions of a search algorithm, logical
relation rules, node or data addition rules and node or data
removal rules of a serial data structure,

whereby said parallel data structure is a parallel version of
said serial data structure created and maintained in a parallel
or distributed environment, said parallel data structure
accomplishing the same goals as said serial data structure.

33. An article as recited in claim 31 wherein said ranges
are non-overlapping ranges that relate to each other in the
same fashion as the data values properly contained in said
ranges such that said range relation rules express the
similarity in relationships to create said parallel data structure.

34. An article as recited in claim 24 wherein said range
determination rules and said range relation rules create the
range definitions with a wide variety of uses, the range
definitions creating complex ranges, said complex ranges
defining complex sets, said complex ranges calculated using
said data in such a way that a given value of said data can
have membership in multiple said complex sets, said
multiple complex sets intersecting each other where said
multiple complex sets contain the data value with membership
in said multiple complex sets.

35. An article as recited in claim 34 wherein said complex
ranges are used to create said parallel data structure such that
it has at least two dimensions, and wherein each said
complex set is organized for access by a different aspect of
the data values stored in said parallel data structure.

36. An article as recited in claim 35 wherein said parallel
nodes related by said commonality in ranges are said
complex sets of said parallel nodes, and wherein each said
complex set of parallel nodes creates a complex said global
node, said complex global nodes related to each other by
said range relation rules.

*****